|
|
Tuesday, 03 June 2008
|
Preamble NOTE: The structure of sample database and a version of the main bases used are available in the "The sample database" If the RDBMS have an optimizer, it does not extradite him to mislead or to be constrained by the shackles of your expression of complaint. In addition optimizer is internal RDBMS, it has no influence on the RDBMS itself, the machine or network infrastructure, crucial in terms of rapid processing of data flows. Here, then, point by point, the elements that must be taken into consideration for "booster" your RDBMS and execution of SQL! 1. The environment The computing environment, ie the overall architecture of SI, has a decisive influence on the performance of RDBMS and hence the speed at which your users will access the data. Indispensable is a dedicated server for the RDBMS especially if it is a Client / Server. The useful is to set the OS server on which works RDBMS so that the server has adequate resources.
It will make adequate access devices based on the number of simultaneous users and the burden they induce. The best is a deterministic network, qu'hélas Ethernet is not. But we can for example use several network cards for the same server for users or groups under separate networks. You can also use devices such as accelerating the speed of the optical fiber connection, or switch rather than hubs. In the latter case, switch administrables present a major interest, that can spread the load ... Below is a table which can draw: | simultaneous users | 15 | 15 | 50 | 50 | 300 | 300 | | flow | weak | strong | weak | strong | weak | strong | | architecture | 10 Mb / s, 1 NIC | 100 Mb / s, 1 NIC | 100 Mb / s, 1 to 2 NICs | 100 Mb / s, 1 to 4 NICs + switch front | 1 Gb / s, 1 NIC FO + switch front | 1 Gb / s, 2 to 4 network cards in front FO + switch cascadé | FOR: fiber optics Beyond that, load balancing through clustering techniques. In this case enter into account the phenomena of replication and it is better to provide a development mode "client server disconnected." Addresses of fixed network resources are preferred. WHAT MAKES WINNING TIME ... A network infrastructure suited to the load, where the number of collision will be minimal, see no ... That is to say: ♦The fast network interfaces (fiber, switch ...) ♦Sub networks parrallélisés ♦A deterministic network protocol (100 VG anylan, SPX / IPX ...) ♦The distribution of the burden by administrables interface devices and / or set of servers 2. The physical server or "machine" A RDBMS not working properly on a machine that is dedicated to him. Any other solution is to outlaw exploitation. The machine is a PC-type server.
ATTENTION: a basic PC can in no case be converted to server because the internal architecture of a server has nothing to do with a PC for personal use. Thus a server is built for parallel data flows and access to resources quickly disc, while a PC is rather designed to quickly process graphics routines ... It will verify that the server chooses has at least the following features and equipment: ♦Redundancy, distributing the load and extracting hot (hot swap and hot-plug) ♦Autocorrective RAM (ECC) ♦Maps redundant network (IPSEC) ♦Ripple Electric (UPS) ♦Hard Drives redundant (RAID) ♦System backup data on external support physical The redundant power supply can continue operating the server despite a failure in this system, and its spare. It is particularly recommended to have a food emergency troubleshooting. The RAM autocorrective ECC (Error Checking and Correction) is a memory that isolates the parties abimée of memory space and prohibits access dynamically, which prevents the occurrence of errors. It must be in sufficient quantity compared to the size of the base. It is not uncommon to define a size of RAM at least 256 MB + size of the database for the server. For example, if your database is 800 MB RAM of more than 1 gigabyte is desirable. Thus almost all the data provided will be mounted in memory to ensure an access time well below the reading of the disc. It has every interest to be at least one network card redundant and assigning a logical address further. It's easier when the card fails to divert traffic to a card already installed to turn off the server to replace the card! A UPS is an electronic device to overcome a lack of power for a limited time. Contrary to popular belief, inverters are not intended to filter the electrical current and normally do not protect the surge or parasites. This requires an additional device to the inverter, which is why there are inverters on-line and off-line. For servers should always use a UPS "on line" with a logic output to inform the server of a power outage, so the extinction of the server takes place under good conditions, or that ' a generator, which must take the relay, is launched NOTE: The heel achille a server and are hard on these discs are capital "knowledge" of the company, ie. Most hardware failures of computer systems come from power supplies and hard disks of PCs which operate 24heures/24 ... So on these issues which should be the most demanding. Regarding the records, all publishers RDBMS a little serious, recommend that a technology-type disk controller RAID SCSI. Rapellons the definition of RAID (Redundant Array of Inexpensive Disks) a set of discs in lequelle part of the physical capacity is used to store information redundant. This information allows redundant regeneration of lost data. The RAID systems are characterized by different levels, the most famous are numbered from 0 to 10 (actually, 0, 1, 3, 4, 5, 6 7 and 10). The level RAID 5 is a good security combined with a good tolerance for failure, since in principle, a damaged disk can be replaced "hot", ie without interrupting the server. The SCSI (Small Computer System Interface) is particularly interesting because it is a "smart" management of disk space and its earlier briefs asking the processor significantly less than conventional technologies such as 'FDI. The flow of SCSI devices is more important and it is multitasking while FDI is mono task. A physical device data backup: a bad habit to believe that it consists simply acquire technology for RAID to be free of any problem and thus make the deadlock on a back-up. It is a heresy! A backup requires genuine relocation of data to guard against any serious problem (water damage, fire, malice ...) but any incident of exploitation such as the removal misguided or malicious data. NOTE: we must not confuse the logic of backup data that most editors RDBMS offer in their package and a backup physical. Indeed, the RDBMS operating permanently it is often not possible to "close" the file constituting the basis of data for copy at a backup. To compensate, publishers offer a mechanism that writes a data set includes the base in an external file for storage purposes and archiving (backup logic). It should always use this principle associated with physical backup. ATTENTION: we must never exceed an occupancy rate of about 67% of disk space. Indeed, in operation, your database "magnified" and plays a more and more on the disc. Over the disc is completed and more access times are long. The phenomenon is not linear. If it is insensitive when the disc is filled low, it can become sensitive during readings when the drive reaches saturation. It may even totally block the RDBMS in case of changes. Some OS network can set up alerts administrative cases exceeded quotas of disk space. They are absolutely necessary for a serene and efficient exploitation of RDBMS. WHAT MAKES WINNING TIME ... ♦A PC that is a true server ♦From the RAM in sufficient quantity ♦The SCSI ♦An occupancy rate of disk space still below 67% 3. The server logic or RDBMS We have already said that the RDBMS must be installed on a dedicated machine. It is even advisable to opt only basic operating a machine. Thus under the organization of your computer system, if you have a database "front office" and another "back office" you must opt for two independent machines (think simply to take exactly the same, because failure of one it is easier to restart all services on the other ...). The parameters of installing the server can have a significant impact on response times but also on the behaviour of your queries. For example, the choice of a set of characters combined with a snack compatible with the sort is binary will be much faster and effective than snack insensitive and diacriticals. Read about this "A Question of Character". But all these parameters are specific to each publisher it is difficult to say more ... WHAT MAKES WINNING TIME ... ♦A base operated on a dedicated machine ♦Parameters RDBMS adapted (snack binary buffer size ...) 4. The database When creating a database it is possible to request a file with a precise size. It should always create the file of the database with the size will have the database during its operation. So if your database must ultimately 3 Go, during the creation of the database, give this value as file size. The devices self adjusting the size of the database are generally wasting time by the fact that they fragment the file constituting the base. Place the data file based on a disc from avoiding the disk system. Place the log file on a disk from avoiding the disk system and one containing data from the base. This delays the saturation disks and saves time because of the parrallelisation tasks of writing the SCSI system on two different physical disk. If you can divide your data based on multiple discs for example by placing the most severe (BLOB) discs less efficient. Choose the size of the data page (the granule base for storing data on your RDBMS). This is calculated based on the size of the adult base, the size of discs and size of the RAM ... It may give an indication the following scale (it is best to consult your RDBMS): if the size recommended by your RDBMS is 8 KB for a few Go with 1 GB of RAM on discs with a 9 GB 32-bit OS ... then: | RAM | 256 MB | 512 MB | 1 Go | 2 GB | 4 GB | | Disc | 4GB | 4GB | 9Go | 18Go | 36GB | | Base | 600 MB | 1 Go | 3 GB | 8 GB | 25 GB | | Page | 2 KB | 4 KB | 8 KB | 16 Ko | 32 KB | Set parameters storage structures depending on the disk space and percentage of updated. For example, structures of trees to store indexes used most often a parameter, the "fill factor" (filling factor), which can be adjusted. Most of the time this "fill factor" is about 92 to 98%. Plus it is low, more access to the index will be fast, but the volume of data held by this index will be important and therefore costly in terms of updating. It's up to you to decide if you want to optimize the readings or entries in the database. WHAT MAKES WINNING TIME ... ♦Create a database with a file size calculated on the volume to adulthood of the base ♦Divide the data files and log on different disks, avoiding the disk system ♦Divide the data across multiple disk, including BLOBS on the discs slower ♦Specify a page size of data appropriate to the physical structure of the system and the volume of data ♦Set parameters storage structures indexes 5. The data model Normalisez maximum ... Create tables smallest possible outsourcing in reference tables all information that could be used several times. Example: Standardize your data types and formats using areas. See define the areas and use ... You will therefore no effort to ask the RDBMS when comparisons on columns of similar content (trantypage implied). Indeed, if you want to compare the client's name defines as VARCHAR (32) on behalf of a prospect defined as NCHAR (25), the RDBMS will provide an extra effort to standardize the types of columns before making the comparison . Choose key whose size is exactly the size of the word processor (eg CHAR (4), INTEGER in a 32-bit OS as Win95, 98, NT, 2000 ....). Prefer key information purely for your joints rather than key natural flow of your analysis. Avoid key composites. Prefer once again a key purely science! Avoid collonnes "nullables" (ie can have a zero value) if the latter must be calculated. In particular, the accounting data and columns representing Boolean data. Prefer save the default 0, if you must use an operator or CASE COALESCE in requests performing an operation arithmetic so that markers NULL are taken into account as 0. Prefer types fixed (CHAR, NCHAR ...) rather than the types variables (VARCHAR. ..) whenever the column will be asked to research and join. Use the maximum constraints provided for in clause CREATE before moving to programming triggers. For example prefer SET ON DELETE NULL and a batch of night for a suppression cascade rather than using a trigger. Model your relationship so powerful. Thus a hierarchy will be advantageously modelled by a tree represented by interval rather than a self reference. Avoid overly complex joints in particular over two tables. Indexez Essentially, that is, the primary keys and foreign columns most solicited and joint research, etc. ... Never columns BLOB or free text! Add your base a table of dates, rather than using math functions temporellles. Dénormalisez wisely if all else failed to in seeking to gain time. WHAT MAKES WINNING TIME ... ♦Normalisez your data, entities and relationships ♦Standardize the size and type of your columns ♦Using keys composed of a single column of a type and purely 32-bit computing (an integer that is perfect) ♦Avoid columns nullables especially if it must be calculated ♦Prefer types fixed (CHAR instead VARCHAR) for columns frequently invited to join research and ♦Use efficient models for your complex relationships (inheritance, trees ....) ♦Indexez basically not superfluous! ♦Add a table of dates rather than using calculation functions time ♦Denormalisez your relationship when all else failed to! 6. In development Allow formatting your data before any integration or updated. For example make sure that a person's name is always capitalized without white parasite neither beginning nor end (TRIM + UPPER). Example - for my part I use a systematic routine formatting, including the following models: | Format | Rules | | DateFr | remove all the characters other than numbers and poses bars DD / MM / YYYY | | Telephone | formate a string of phone number to: 33 1 45 78 45 78 (11 digits) or 01 45 78 45 78 (10 digits) in all other cases: Groups 2 and if odd, then begins with isolated figure 1 | | First name | retains only the characters a to z (U.S.) accent, and diacritical letters and characters [ ',' - ','''] (space, dash, apostrophe) with initial capitalization of | | Mail | Formatting an e-mail address: tiny, acceptance of the characters' _ ',' - ', and'. " ; Mandatory presence of a single @ | | LOGIN | retains only the characters A to Z (capital) and the '_' (underscore) transforms the accents and letters diacritics. All other characters are rejected | | 09 | retains only the characters 0 to 9 with no spaces | | AZ_maj | retains only the characters A to Z (capital), transforms the accents and letters diacritics. All other characters are rejected | | AZ09_maj | retains only the characters A to Z (capital) and 0 to 9, transforms the accents and letters diacritics. All other characters are rejected | | az_min | retains only the characters a to z (U.S.), transforms the accents and letters diacritics. All other characters are rejected | | az09_min | retains only the characters a to z (U.S.) and 0 to 9, transforms the accents and letters diacritics. All other characters are rejected | | AZ_maj_plus | retains only the characters A to Z (capital) and the characters [ ',' - ','''] (space, dash, apostrophe) with no other characters (or accent, nor letters diacritical) | | AZ09_maj_plus | retains only the characters A to Z (capital) and 0 to 9 and characters [ ',' - ','''] (space, dash, apostrophe) with no other characters (or accent, nor letters diacritical) | | az_min_plus | retains only the characters a to z (U.S.), and characters [ ',' - ','''] (space, dash, apostrophe) with no other characters (or accent, nor letters diacritical) | | az09_min_plus | retains only the characters a to z (U.S.) and 0 to 9 t characters [ ',' - ','''] (space, dash, apostrophe) with no other characters (or accent, nor letters diacritical) | | Az09plusplus | converted accents and diacritic letters to non-accented non diacritees | Therefore searches and comparisons do not need to be made with a type setting "insensitive" + "insensitive to the accents," because the data are still formatted the same way. This accelerates research significantly. Prevent duplication seeking before insertion if the line has not already been seized. For proper names, use the phonetic research as Soundex. See Art "Soundex. Ban orphans ... Use referential integrity and triggers to extend it to prevent any line orphan. Index words of your texts long they have to be searched. See indexing text. WHAT MAKES WINNING TIME ... ♦A good formatting data entered and updated search allows faster ♦The lack of duplication avoids double counting and the use of keyword DISTINCT ♦A text indexing avoids lengthy and tedious research 7. Operating Watch the volume of data: the base is too big? Why? The file of the newspaper too big? Can you truncate? The disk space is it consistent enough (at least 33%)? ... If your RDBMS has a optimizer statistics, regularly update the statistics. Reindexez from time to time basis, especially after significant updates a lot. Rearrange your base, in order to maximize the number of pages rellement useful. Watch the load access and the volume of traffic: identify the major consumers and optimize their "pipes". WHAT MAKES WINNING TIME ... ♦Do not get below 33% empty space on disks ♦Update statistics ♦Rearrange your base reindexez regularly and if necessary ♦Watch the load and adpatez pipes 8. Optimiseurs and plan queries Most RDBMS have a optimizer analyzes a logical or statistical how best to execute the request. It is not uncommon that you can intervene on how the RDBMS and optimizing its claim to operate. The way it will operate is called the "plan request" and shows operations that will make simplistic to respond to your request. The RDBMS such as Oracle, MS SQL Server or DB2 have a tool to view this plan. Here is an example of the visualization tool plans queries MS SQL Server:  It is certainly very beautiful, but the main information is missing: the running time of each stage and the number of entry / exit made on the disc ... To this must be done right click on each emblem, which is not very practical ...
According to the publishers, you can intervene directly in terms of application or specify the index to use or you may be allowed to re-write your query differently in order to find the most economical optimization of time and resources ... WHAT MAKES WINNING TIME ... ♦Choose the right plan! 9. Transformations usual Here are some typical changes should be borne in mind in order to optimize your queries. ATTENTION: make sure that the transformation makes a reduction of processing time because it is not always the case and may depend on your indexing, the type of data, parameters of your RDBMS and resources of your machine ... There is no miracle, only tests can convince the effectiveness of your query write this or that way. | No. | AVOID | PREFER | | 1 | Avoid using the star in the SELECT clause ... SELECT * FROM T_CLIENT | ... prefer to appoint a column to a SELECT CLI_ID, TIT_CODE, CLI_NOM, CLI_PRENOM, CLI_ENSEIGNE FROM T_CLIENT | | 2 | Avoid using DISTINCT clause in the SELECT ... SELECT DISTINCT CHB_NUMERO, CHB_ETAGE FROM T_CHAMBRE | ... when it is not necessary SELECT CHB_NUMERO, CHB_ETAGE FROM T_CHAMBRE | | 3 | Do not column in the SELECT clause ... of the complaint under EXISTS ... SELECT CHB_ID FROM T_CHAMBRE T1 WHERE NOT EXISTS (SELECT CHB_ID FROM TJ_CHB_PLN_CLI T2 WHERE PLN_JOUR = '2000-11-11' AND T2.CHB_ID = T1.CHB_ID) | ... use the star or a constant SELECT CHB_ID FROM T_CHAMBRE T1 WHERE NOT EXISTS (SELECT * FROM TJ_CHB_PLN_CLI T2 WHERE PLN_JOUR = '2000-11-11' AND T2.CHB_ID = T1.CHB_ID) | | 4 | do not count a column ... SELECT COUNT (CHB_ID) FROM T_CHAMBRE | ... when it just counting lines SELECT COUNT (*) FROM T_CHAMBRE | | 5 | avoid using LIKE ... SELECT * FROM T_CLIENT WHERE CLI_NOM LIKE 'D%' | ... if a range of research helps SELECT * FROM T_CLIENT WHERE CLI_NOM BETWEEN 'D' AND 'E ' | | 6 | avoid joints in the WHERE ... SELECT * FROM T_CLIENT C, T_FACTURE F WHERE EXTRACT(YEAR FROM F.FAC_DATE) = 2000 AND F.CLI_ID = C.CLI_ID | ... prefer the operator Standard JOIN SELECT * FROM T_CLIENT C JOIN T_FACTURE F ON F.CLI_ID = C.CLI_ID WHERE EXTRACT(YEAR FROM F.FAC_DATE) = 2000 | | 7 | Avoid ranges <and> for discrete values ... SELECT * FROM T_FACTURE WHERE FAC_DATE > '2000-06-18' AND FAC_DATE < '2000-07-15' | ... prefer BETWEEN SELECT * FROM T_FACTURE WHERE FAC_DATE BETWEEN '2000-06-18' AND '2000-07-14' | | 8 | IN avoid with discrete values recouvrantes ... SELECT * FROM T_CHAMBRE WHERE CHB_NUMERO IN (11, 12, 13, 14) | ... prefer BETWEEN SELECT * FROM T_CHAMBRE WHERE CHB_NUMERO BETWEEN 11 AND 14 | | 9 | Avoid using the DISTINCT ... SELECT DISTINCT CLI_NOM, CLI_PRENOM FROM T_CLIENT C JOIN TJ_CHB_PLN_CLI J ON C.CLI_ID = J.CLI_ID WHERE PLN_JOUR = '2000-11-11 | ... if a complaint under EXISTS offers duplication SELECT CLI_NOM, CLI_PRENOM FROM T_CLIENT C WHERE EXISTS (SELECT * FROM TJ_CHB_PLN_CLI J WHERE C.CLI_ID = J.CLI_ID AND PLN_JOUR = '2000-11-11') | | 10 | avoid requests under ... SELECT CHB_ID FROM T_CHAMBRE WHERE CHB_ID NOT IN (SELECT CHB_ID FROM TJ_CHB_PLN_CLI WHERE PLN_JOUR = '2000-11-11') | ... when you can use the joints SELECT DISTINCT C.CHB_ID FROM T_CHAMBRE C LEFT OUTER JOIN TJ_CHB_PLN_CLI P ON C.CHB_ID = P.CHB_ID AND PLN_JOUR = '2000-11-11' WHERE P.CHB_ID IS NULL | | 11 | avoid requests under with IN ... SELECT CHB_ID FROM T_CHAMBRE WHERE CHB_ID NOT IN (SELECT CHB_ID FROM TJ_CHB_PLN_CLI WHERE PLN_JOUR = '2000-11-11') | ... when you can use EXISTS SELECT CHB_ID FROM T_CHAMBRE T1 WHERE NOT EXISTS (SELECT * FROM TJ_CHB_PLN_CLI T2 WHERE PLN_JOUR = '2000-11-11' AND T2.CHB_ID = T1.CHB_ID) | | 12 | COALESCE transform the ... SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) * (1 - COALESCE(LIF_REMISE_POURCENT, 0)/100) - COALESCE(LIF_REMISE_MONTANT, 0) AS TOTAL_LIGNE FROM T_LIGNE_FACTURE | ... UNION SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) FROM T_LIGNE_FACTURE WHERE LIF_REMISE_POURCENT IS NULL AND LIF_REMISE_MONTANT IS NULL UNION SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) - LIF_REMISE_MONTANT FROM T_LIGNE_FACTURE WHERE LIF_REMISE_POURCENT IS NULL AND LIF_REMISE_MONTANT IS NOT NULL UNION SELECT LIF_ID, (LIF_QTE* LIF_MONTANT) * (1 - LIF_REMISE_POURCENT/100) FROM T_LIGNE_FACTURE WHERE LIF_REMISE_POURCENT IS NOT NULL AND LIF_REMISE_MONTANT IS NULL UNION SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) * (1 - LIF_REMISE_POURCENT/100) - LIF_REMISE_MONTANT FROM T_LIGNE_FACTURE WHERE LIF_REMISE_POURCENT IS NOT NULL AND LIF_REMISE_MONTANT IS NOT NULL | | 13 | transform the CASE ... ELECT CHB_NUMERO, CASE CHB_ETAGE WHEN 'RDC' THEN 0 WHEN '1er' THEN 1 WHEN '2e' THEN 2 END AS ETAGE, CHB_COUCHAGE FROM T_CHAMBRE ORDER BY ETAGE, CHB_COUCHAGE | ... UNION SELECT CHB_NUMERO, 0 AS ETAGE, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE = 'RDC' UNION SELECT CHB_NUMERO, 1 AS ETAGE, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE = '1er' UNION SELECT CHB_NUMERO, 2 AS ETAGE, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE = '2e' ORDER BY ETAGE, CHB_COUCHAGE | | 14 | transform the EXCEPT ... SELECT CHB_ID FROM T_CHAMBRE EXCEPT SELECT CHB_ID FROM TJ_CHB_PLN_CLI WHERE PLN_JOUR = '2000-11-11' | ... joints SELECT DISTINCT C.CHB_ID FROM T_CHAMBRE C LEFT OUTER JOIN TJ_CHB_PLN_CLI P ON C.CHB_ID = P.CHB_ID AND PLN_JOUR = '2000-11-11' WHERE P.CHB_ID IS NULL | | 15 | INTERSECT transform the ... SELECT CHB_ID FROM T_CHAMBRE INTERSECT SELECT CHB_ID FROM TJ_CHB_PLN_CLI WHERE PLN_JOUR = '2000-11-11' | to join ... SELECT DISTINCT C.CHB_ID FROM T_CHAMBRE C INNER JOIN TJ_CHB_PLN_CLI P ON C.CHB_ID = P.CHB_ID WHERE PLN_JOUR = '2000-11-11' | | 16 | UNION transform the ... SELECT OBJ_NOM AS NOM, OBJ_PRIX AS PRIX FROM T_OBJET UNION SELECT MAC_NOM AS NOM, MAC_PRIX AS PRIX FROM T_MACHINE ORDER BY NOM, PRIX (complete example is in the technical RDBMS) | to join ... SELECT COALESCE(OBJ_NOM, MAC_NOM) AS NOM, COALESCE(OBJ_PRIX, MAC_PRIX) AS PRIX FROM T_OBJET O FULL OUTER JOIN T_MACHINE M ON O.OBJ_NOM = M.MAC_NOM AND O.OBJ_PRIX = M.MAC_PRIX ORDER BY NOM, PRIX | | 17 | transform sub queries <> ALL ... SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE <> ALL (SELECT CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | ... a NOT IN SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE NOT IN (SELECT CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | | 18 | transform sub queries = ANY ... SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE = ANY (SELECT CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | ... in IN SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE IN (SELECT CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | | 19 | transform sub queries ANY / ALL ... SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE > ALL (SELECT CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | ... in combination with queries and aggregat SELECT CHB_ID, CHB_COUCHAGE FROM T_CHAMBRE WHERE CHB_COUCHAGE > (SELECT MAX(CHB_COUCHAGE) FROM T_CHAMBRE WHERE CHB_ETAGE ='RDC') | | 20 | avoid requests under correlated ... SELECT DISTINCT VILLE_ETP FROM T_ENTREPOT AS ETP1 WHERE NOT EXISTS (SELECT * FROM T_RAYON RYN WHERE NOT EXISTS (SELECT * FROM T_ENTREPOT AS ETP2 WHERE ETP1.VILLE_ETP = ETP2.VILLE_ETP AND (ETP2.RAYON_RYN = RYN.RAYON_RYN))) (complete example is in the relational division ...) | ... prefer sub queries without correlation SELECT DISTINCT VILLE_ETP FROM T_ENTREPOT WHERE RAYON_RYN IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELECT RAYON_RYN FROM T_RAYON))) GROUP BY VILLE_ETP HAVING COUNT (*) = (SELECT COUNT(DISTINCT RAYON_RYN) FROM T_RAYON) | | 21 | avoid requests under correlated ... SELECT FAC_ID, (SELECT MAX(LIF_QTE * LIF_MONTANT) FROM T_LIGNE_FACTURE L WHERE F.FAC_ID = L.FAC_ID) FROM T_FACTURE F ORDER BY FAC_ID | ... prefer joints SELECT F.FAC_ID, MAX(LIF_QTE * LIF_MONTANT) FROM T_FACTURE F JOIN T_LIGNE_FACTURE L ON F.FAC_ID = L.FAC_ID GROUP BY F.FAC_ID ORDER BY F.FAC_ID | | 22 | do not use numbers in the ORDER BY clause ... SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) FROM T_LIGNE_FACTURE ORDER BY 1, 2 | ... preferably specify the names of columns, including a clause SELECT SELECT LIF_ID, (LIF_QTE * LIF_MONTANT) AS LIF_MONTANT FROM T_LIGNE_FACTURE ORDER BY LIF_ID, LIF_MONTANT | NOTE: all transformations do not respond the same way the charge. In other words, depending on the volume of data such and such a transformation may be saving time can be lost when the volume of data increases. CE QUI FAIT GAGNER DU TEMPS... * Turn your queries and choose after tests * Re estimate the cost to be borne 10. Some tips Select the maximum using the WHERE clause, the fewer lines returned, the better the deal quickly RDBMS data. If your RDBMS supports it, add a clause limiting the number of rows returned (TOP, LIMIT ...), especially when you test or devise your code and that the base is operated. Less RDBMS data to be used, the more it will be fast. Plan at least using the SELECT clause and columns named. Less ago columns returned, the better the deal quickly RDBMS your query. Surnommez your tables with aliases as short as possible. Avoid prefix columns not ambiguous. Example: -- Nickname trops long prefixes unnecessary SELECT CLIENT.CLI_ID, CLIENT.CLI_NOM, CLIENT.CLI_PRENOM, TELEPHONE.TEL_NUMERO FROM T_CLIENT CLIENT JOIN T_TELEPHONE TELEPHONE ON CLIENT.CLI_ID = TELEPHONE.CLI_ID -- The flow of characters includes at least 55 characters and is equally understandable. SELECT C.CLI_ID, CLI_NOM, CLI_PRENOM, TEL_NUMERO FROM T_CLIENT C JOIN T_TELEPHONE T ON C.CLI_ID = T.CLI_ID In a batch repetant this request 2 000 times the volume of unnecessary characters would have been 100 K. .. Do not join unnecessary filtering. Example: -- A reference table is used to capture interactive reference - it is redundant in filtering since the code contained in - the mother table (T_TELEPHONE) SELECT C.CLI_ID, CLI_NOM, CLI_PRENOM, TEL_NUMERO FROM T_CLIENT C JOIN T_TELEPHONE T ON C.CLI_ID = T.CLI_ID JOIN T_TYPE TT ON T.TYP_CODE = TT.TYP_CODE WHERE TYP_LIBELLE = 'Phone' -- The mother table (T_TELEPHONE) is filtered directly - the code on the phone type 'fixed' SELECT C.CLI_ID, CLI_NOM, CLI_PRENOM, TEL_NUMERO FROM T_CLIENT C JOIN T_TELEPHONE T ON C.CLI_ID = T.CLI_ID WHERE TYP_CODE = 'TEL' In a speech filtered, always place a single column on one side of the comparison operator. Example: -- Index on the column LIF_QTE or LIF_MONTANT can not be activated T_LIGNE_FACTURE SELECT * FROM T_LIGNE_FACTURE WHERE LIF_QTE + 10 = LIF_MONTANT / 5 -- Index on the column can be activated LIF_QTE SELECT * FROM T_LIGNE_FACTURE WHERE LIF_QTE = LIF_MONTANT / 5 - 10 Do not use a wildcard debut word in the context of a search LIKE. If the need is imperative, add a column containing this string in reverse. Example: CREATE TABLE T_MOT (MOT VARCHAR(25)) INSERT INTO T_MOT (MOT) VALUES ('marchand') INSERT INTO T_MOT (MOT) VALUES ('marcher') INSERT INTO T_MOT (MOT) VALUES ('flamand') SELECT * FROM T_MOT WHERE MOT LIKE '%and' MOT ------------------------- Flemish merchant - the index on the column MOT can be activated ALTER TABLE T_MOT ADD TOM VARCHAR(25) UPDATE T_MOT SET TOM = REVERSE(MOT) -- REVERSE returns the string by reversing the order of letters SELECT * FROM T_MOT WHERE TOM LIKE 'dna%' MOT TOM ------------------------- ------------------------- Merchant dnahcram Flemish dnamalf -- index on the column can be activated TOM Avoid seeking negation (NOT) or difference (<>), prefer looking positive. Example: SELECT * FROM T_FACTURE WHERE FAC_PMT_DATE NOT BETWEEN FAC_DATE AND FAC_DATE + INTEVAL 30 DAY SELECT * FROM T_FACTURE WHERE FAC_PMT_DATE < FAC_DATE OR FAC_PMT_DATE > FAC_DATE + NTEVAL 30 DAY Create views to simplify your queries. Example: -- Research room rates 1, 3 and 5 at the date of 25/12/2000 SELECT CHB_ID, TRF_CHB_PRIX FROM TJ_TRF_CHB WHERE TRF_DATE_DEBUT = (SELECT MAX(TRF_DATE_DEBUT) FROM TJ_TRF_CHB WHERE TRF_DATE_DEBUT < '2000-12-25') AND CHB_ID IN (1, 3, 5) -- View simplifying the presentation intervals validity tariffs V_TARIF_CHAMBRE CREATE VIEW V_TARIF_CHAMBRE AS SELECT CHB_ID, TRF_DATE_DEBUT, COALESCE((SELECT MIN(TRF_DATE_DEBUT) FROM TJ_TRF_CHB T2 WHERE T2.TRF_DATE_DEBUT > T1.TRF_DATE_DEBUT), '2099-12-31') - INTERVAL 1 DAY AS TRF_DATE_FIN, TRF_CHB_PRIX FROM TJ_TRF_CHB T1 SELECT CHB_ID, TRF_CHB_PRIX FROM V_TARIF_CHAMBRE WHERE '2000-12-25' BETWEEN TRF_DATE_DEBUT AND TRF_DATE_FIN AND CHB_ID IN (1, 3, 5) Try never to use BLOB (TEXT, BLOB, CLOB ...) store your pictures, your long texts and your resources files directly involved in the files of the OS, it desemcombre database, which will process the data most useful faster. See for example the article "The images in my base". Try never to use the CASE in queries. Use the transformations UNION joints or at worst made this kitchen in the code of the client interface (see above No. 13). Do not type UNICODE for your strings if your application has no interest in being internayionalisee. Indeed columns UNICODE (NCHAR, NVARCHAR) are twice as long and coutent therefore double in processing time compared with columns of type or VARCHAR CHAR ... Specify always the list of columns in an order INSERT. Example: -- Ill INSERT INTO T_CLIENT VALUES (198, 'M.', 'DUCORNET', 'Archibald', NULL) -- Good INSERT INTO T_CLIENT (CLI_ID, TIT_CODE, CLI_NOM, CLI_PRENOM, CLI_ENSEIGNE) VALUES (198, 'M.', 'DUCORNET', 'Archibald', NULL) -- Excellent (insertion implicit NULL) INSERT INTO T_CLIENT (CLI_ID, TIT_CODE, CLI_NOM, CLI_PRENOM) VALUES (198, 'M.', 'DUCORNET', 'Archibald') Re your request to read the tables still in the same order (for example alphabetical order of names of tables) so as to prevent fatal locks and waiting too long due to interblocages. Download the server of the basic tasks that you can easily do on the client. For example, if you must submit the original name, do not ask the server, rappatriez names and select the first letter in the code of the client interface. Avoid the systematic use of a clause ORDER BY. If the order you want is complex, add a column ORDER and specify it manually. Example: -- You want to present a list of names of countries in the first France - second in the states of the European Union in alphabetical order - and all other third countries orde alphabet. CREATE TABLE T_PAYS (PAYS VARCHAR(16), UNION_EUROPEENE BIT(1)) INSERT INTO T_PAYS VALUES ('Germany', 1) INSERT INTO T_PAYS VALUES ('Austria', 1) INSERT INTO T_PAYS VALUES ('Spain', 1) INSERT INTO T_PAYS VALUES ('France', 1) INSERT INTO T_PAYS VALUES ('Ireland', 1) INSERT INTO T_PAYS VALUES ('Chile', 0) INSERT INTO T_PAYS VALUES ('China', 0) INSERT INTO T_PAYS VALUES ('Japan', 0) SELECT PAYS, 1 AS N FROM T_PAYS WHERE PAYS = 'France' UNION SELECT PAYS, 2 AS N FROM T_PAYS WHERE UNION_EUROPEENE = 1 AND PAYS <> 'France' UNION SELECT PAYS, 3 AS N FROM T_PAYS WHERE UNION_EUROPEENE = 0 ORDER BY N, PAYS - bad: heavy query! -- Adding a column order ALTER TABLE T_PAYS ADD ORDRE INTEGER --added order manual UPDATE T_PAYS SET ORDRE = 1 WHERE PAYS = 'France' UPDATE T_PAYS SET ORDRE = 2 WHERE PAYS = 'Germany' UPDATE T_PAYS SET ORDRE = 3 WHERE PAYS = 'Austria' UPDATE T_PAYS SET ORDRE = 4 WHERE PAYS = 'Spain' UPDATE T_PAYS SET ORDRE = 5 WHERE PAYS = 'Ireland' UPDATE T_PAYS SET ORDRE = 6 WHERE PAYS = 'Chile' UPDATE T_PAYS SET ORDRE = 7 WHERE PAYS = 'China' UPDATE T_PAYS SET ORDRE = 8 WHERE PAYS = 'Japan' SELECT PAYS, ORDRE FROM T_PAYS ORDER ORDRE Use UNION ALL if you do not want duplication in the result, it penalizes least the server. Try to liberate the HAVING clause eg imbriquant a sub request in the FROM clause. An index created in two columns can not be used to filter the second column because it stores data in principle on the concatenation of two columns. Restructurez indexes avoiding composite index. CREATE INDEX NDX_CLI_NOM_PRENOM ON T_CLIENT (CLI_NOM, CLI_PRENOM) SELECT * FROM T_CLIENT ORDRE BY PRENOM - the index can not be activated on the single column CLI_PRENOM because it contains: AIACHAlexandre ALBERTChristian AUZENATMichel BACQUEMichel BAILLYJean-François ... Activate the statistical calculations after massive updates. Make transactions as short as possible, confirm your transactions as soon as possible, put points partial backup. Where an application becomes too large, consider making a stored procedure set, especially if your RDBMS prepares requests. Avoid as much as possible to use CURSOR. Prefer queries same complex. Bannissez the CURSOR pacourus back (a clause ORDER BY ... DESC replaces easily). Ban you CURSOR traveled jumped by more than one line. WHAT MAKES WINNING TIME ... * Avoid large traps * Add information: tables, views, indexes to facilitate querying 11. CONCLUSION Sorry, there is no miracle recipe for optimizing queries. Just a few big mistakes to avoid. The main thing is a good setting of the machine and its environment, server and the OS. The rest is specific to each RDBMS and requires a little elbow grease and lots of caution. But do not forget that tests support are essential to decide between this or that expression of complaint with the likely volume of your core holding.
|
|
|