Tuesday, 27 May 2008
1. Qualities of a good key

For a piece of information to be a candidate key (often called a primary key, a pleonasm whose only purpose is to set the concept of key apart, terminologically, from that of an index, a secondary key or a foreign key), it needs the following qualities:

- Concision
- Uniqueness
- Stability

Concision because, as the saying goes, "what is well conceived is clearly stated". What would we think of a mail-order company whose product references were 32-character codes that had to be read out over the phone to place an order?

Uniqueness because the key must retrieve all the information about one individual item, not a group or a set. Imagine the cacophony if the same telephone number were assigned to several subscribers!

Stability, finally, because the effort of remembering should be kept to a minimum. How many times do we curse the people or companies who keep changing their contact details (telephone, fax, e…) or who regularly "restyle" their reference data by changing the types and values of their classifications?

For computing purposes, a perfect key would cost nothing to store, would be universally unique (even from one company to another, which some computer systems are beginning to do) and would never be modified, altered or deleted, even after the death of the item it represents. More modestly, a good key has a negligible cost (an integer value, for example), is unique at least within the model it belongs to, and is assigned once and for all for the entire life of the object it identifies.

2. Natural key or numeric key?

The debate over whether to use a natural key, that is, one taken from the information columns of the table, or an "artificial" key specific to the computer system, is at least as old as computing itself.
Indeed, the need for keys has been there since the very first computer files, and many well-known references derive from it: postal codes, company registration numbers, motor vehicle registrations, social security numbers… All of these were created by the legislator specifically for census and statistical purposes. The question is: should we use one of these references in a data model, or should we add a new one that belongs to the information system itself?

The debate can be settled quickly by looking at how these keys behave over the life of the information and, more generally, over the life of the "living" things hidden behind the concepts that computing models.

Take the social security number: immutable, it is assigned "for life". Is it a good key for a computer file? Unfortunately not! A foreigner, meaning a person who does not fall under the French employment system, has no such reference. And even if that person did work in France, a provisional number would be assigned while waiting for a definitive one.

Another example: the vehicle registration document, and therefore the registration number, looks like an ideal key for modeling a fleet of motor vehicles. Unfortunately, before the prefecture issues a final registration, the garage issues a provisional certificate under the supervision of the administration. But there is worse… In the early eighties, the department of the Marne, in the Champagne region, had the most favourable rate for the annual car tax sticker. Car rental companies began registering all their vehicles in that department to thumb their noses at the Minister of Finance. The minister, angered, had a law passed to recover part of the expected tax revenue that had vanished because of the rental companies' Machiavellian plan. How many vehicles saw their registration change while still in the hands of the same owner?
One could also cite the case of companies that were bought, merged, absorbed, turned into holdings or moved abroad, and whose company register number, supposedly immutable, was altered or deleted!

It seems clear, then, that using a natural key is a very bad idea, and that a good old number does the job perfectly well in most cases. We still need to know how to generate its value!

3. The auto increment

Computing a new key value, when the key is numeric and preferably an integer, is disconcertingly easy. It is neither more nor less than incrementing, that is, adding one to the highest value already assigned. When this mechanism is automated it is called, quite simply, auto increment. The difficulty does not lie in the calculation itself: the crucial point is where the code performing the auto increment is located!

3.1. A false good idea

The easiest way to build such a mechanism is to look up the maximum value already assigned in the data concerned and add one to it. The following SQL statement returns exactly that value:

    SELECT MAX(LaColonneClef) + 1
    FROM   MaTable

But this approach has two very severe limits:

- you must be certain that a key, once computed, is ALWAYS used and never abandoned;
- you must rule out any concurrency between users.

Indeed, such a mechanism cannot do its job properly if a backup or an archive may have to be restored, or if the application is used by several users at the same time.

Let us look at archiving first... At a given moment, the table MaTable contains key 48, assigned to Mr Paul DUFOUR, which is the highest key value. An archiving run is carried out, and row 48 is then deleted from the table. The highest key is now 47. A new key is generated by the mechanism above to record the data of Alain DUMAS, and this new key is 48 (47 + 1). For one reason or another, we need to reload the archived rows into the table… Unfortunately the archive cannot be reloaded, since DUFOUR's key 48 has been reassigned to DUMAS!
Of course, restoring a backup leads to similar problems...
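The archiving scenario above can be replayed in a few lines. This is only a sketch: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for the real RDBMS, and the Archive table, the Nom column and the next_key_max_plus_one helper are names invented for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real RDBMS (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MaTable (LaColonneClef INTEGER PRIMARY KEY, Nom TEXT)")
conn.execute("CREATE TABLE Archive (LaColonneClef INTEGER PRIMARY KEY, Nom TEXT)")


def next_key_max_plus_one(conn):
    """Naive key computation: highest key currently in the table, plus one.

    COALESCE only covers the empty-table case, where MAX returns NULL.
    """
    row = conn.execute(
        "SELECT COALESCE(MAX(LaColonneClef), 0) + 1 FROM MaTable"
    ).fetchone()
    return row[0]


# Keys 1..47 already exist; key 48 belongs to Paul DUFOUR.
conn.executemany(
    "INSERT INTO MaTable VALUES (?, ?)",
    [(k, f"person {k}") for k in range(1, 48)],
)
conn.execute("INSERT INTO MaTable VALUES (48, 'Paul DUFOUR')")

# Archive row 48, then delete it from the live table.
conn.execute("INSERT INTO Archive SELECT * FROM MaTable WHERE LaColonneClef = 48")
conn.execute("DELETE FROM MaTable WHERE LaColonneClef = 48")

# The naive computation hands key 48 out again, this time to Alain DUMAS.
reused_key = next_key_max_plus_one(conn)
conn.execute("INSERT INTO MaTable VALUES (?, 'Alain DUMAS')", (reused_key,))

# Reloading the archive now fails: key 48 is already taken.
try:
    conn.execute("INSERT INTO MaTable SELECT * FROM Archive")
    restore_failed = False
except sqlite3.IntegrityError:
    restore_failed = True

print(reused_key, restore_failed)  # prints: 48 True
```

The primary key constraint is what makes the failure visible here; without it, the reload would silently leave two different people sharing key 48.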
But the most serious case is the multi-user one... User A acquires a key in order to enter the data of Mr Gilles LEBLANC. He is given key 53, since the last key value stored in the table MaTable is 52. Moments later, user B needs to enter the information of Mr Pierre LENOIR and also requests a key. As you will have guessed, A has not yet finished entering his data, so user B is given the same key value as user A, namely 53. If the code was written in a hurry, it is quite possible for A to save LEBLANC's information in a child table while B has just saved Mr LENOIR's row. The information of one individual ends up attached to another, something seen recently at the highest level of our exasperating administration...

We conclude that such a solution must be avoided. But it gives us two interesting elements that put us on the path to a safer algorithm:

- a key that has been used must never be reassigned, in other words: any key consumed is lost;
- throughout the period that runs from the calculation of the new key to its insertion in the table, no other user of the computer system must be able to obtain the same value.

Hence, paradoxically, we cannot rely on the table itself to calculate the value of its new key. The conclusion is obvious: the mechanism that calculates the new key must be EXTERNAL to the table!

3.2. The solution: a key table

One solution is to create, within the database, a table holding the latest key value of each table. Such a table could look like this:

    CREATE TABLE LesClefs
    (NomTable   CHAR(128) NOT NULL PRIMARY KEY,
     ValeurClef INTEGER   NOT NULL DEFAULT 0)

It could contain the following rows:

    NomTable                         ValeurClef
    -------------------------------- -----------------
    MaTable                          58
    UneAutreTable                    1587
    ...
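To see why an external counter satisfies both rules, here is a rough sketch built on the LesClefs table above. It again uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for the real RDBMS; the allocate_key helper and the Nom column are invented for the illustration, and BEGIN IMMEDIATE is SQLite's way of taking the write lock up front (other RDBMS use different locking syntax).

```python
import sqlite3

# In-memory SQLite database standing in for the real RDBMS (an assumption).
# isolation_level=None lets us issue BEGIN / COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE LesClefs ("
    " NomTable CHAR(128) NOT NULL PRIMARY KEY,"
    " ValeurClef INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("CREATE TABLE MaTable (LaColonneClef INTEGER PRIMARY KEY, Nom TEXT)")
# The last key handed out for MaTable is 52, as in the multi-user example.
conn.execute("INSERT INTO LesClefs (NomTable, ValeurClef) VALUES ('MaTable', 52)")


def allocate_key(conn, table_name):
    """Hypothetical helper: bump the counter of table_name, return the new value."""
    # Increment and read-back happen inside one transaction, so no other
    # connection can slip in between the two statements.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "UPDATE LesClefs SET ValeurClef = ValeurClef + 1 WHERE NomTable = ?",
            (table_name,),
        )
        row = conn.execute(
            "SELECT ValeurClef FROM LesClefs WHERE NomTable = ?", (table_name,)
        ).fetchone()
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    if row is None:
        raise KeyError(f"no counter registered for {table_name}")
    return row[0]


key_a = allocate_key(conn, "MaTable")  # user A, for Gilles LEBLANC
key_b = allocate_key(conn, "MaTable")  # user B, before A has inserted anything
conn.execute("INSERT INTO MaTable VALUES (?, 'Pierre LENOIR')", (key_b,))
conn.execute("INSERT INTO MaTable VALUES (?, 'Gilles LEBLANC')", (key_a,))

# Deleting a row does not give its key back: the counter only moves forward.
conn.execute("DELETE FROM MaTable WHERE LaColonneClef = ?", (key_b,))
key_c = allocate_key(conn, "MaTable")

print(key_a, key_b, key_c)  # prints: 53 54 55
```

Because the counter lives outside MaTable, neither the order in which users finish their data entry nor later deletions can make a value come back.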
All that remains is to implement the key calculation itself. The calculation must be done together with the update of the key table, hence the idea of a transaction:

    UPDATE LesClefs
    SET    ValeurClef = ValeurClef + 1
    WHERE  NomTable = 'MaTable'

    SELECT ValeurClef
    FROM   LesClefs
    WHERE  NomTable = 'MaTable'

    COMMIT

By default, standard SQL considers that the first SQL statement issued starts a transaction. That is why our code contains no BEGIN TRANSACTION statement, which does not exist in the standard anyway! Be careful, however: this is not always true of modern RDBMS, most of which run in auto-commit mode...

Of course, all of this can be placed in a stored procedure that performs every operation, and even creates the row in the key table if it happens not to exist. One refinement is to leave the caller free to choose which column is auto-incremented, or even to allow several columns to be auto-incremented.

Such a stored procedure can be called from a trigger as well as from a host language. Here is an example for WinDev (shown with WinDev's French keywords):

    FONCTION SQLIdentAuto(NomFic)
    // Computes the next automatic identifier (auto increment)
    // Returns -1 if something goes wrong
    IdentAuto est un entier long
    SI SQLExec("SP_SYS_DB_CALC_NEW_KEY '" + NomFic + "'", "MAXREQ") ALORS
        SqlAssocie("MAXREQ", IdentAuto)
        SQLPremier("MAXREQ")
    SINON
        IdentAuto = -1
        Erreur("Increment error")
    FIN
    // Close the query before returning, otherwise this line is never reached
    SQLFerme("MAXREQ")
    RENVOYER IdentAuto

4. Mechanisms built into the RDBMS

RDBMS vendors have come up with several solutions for auto-incrementing tables. RDBMS such as Paradox or Access offer columns of type AUTOINC, that is, an integer whose value is computed at each new insertion. Of course, each newly computed key value is regarded as consumed, so there is no loophole through which a value could be handed out twice.

Some RDBMS offer a dedicated database object, a "generator", that supplies an auto-incremented integer on each call.
This is the case of Oracle and InterBase (Borland). Here is an example for Borland's InterBase.

Creating the generator:

    CREATE GENERATOR monGenerateur;
    SET GENERATOR monGenerateur TO 2301;

This reserves a space to store the auto-increment value under the name monGenerateur and initializes it to 2301. You can then either force every key to be computed by this mechanism, by means of a trigger, or call it explicitly, for example when inserting data. Here is a trigger that does this:

    CREATE TRIGGER AUTOINC_CLI FOR maTable
    BEFORE INSERT
    AS
    BEGIN
       NEW.laClef = GEN_ID(monGenerateur, 1);
    END

NEW.laClef is the value of the column once the trigger has run, and GEN_ID is the function that calls the generator and applies the increment.

ATTENTION: if you want to know the value of the key after an insertion, do not query the generator, because it may already have been called by a concurrent user for another insertion. In that case the key must be generated before the insertion and outside the trigger:

    BEGIN TRANSACTION
    DECLARE NEW_ID INTEGER
    SET NEW_ID = GEN_ID(monGenerateur, 1)
    INSERT INTO MaTable (ID, VALUE) VALUES (NEW_ID, 'xxx')
    ...
    COMMIT TRANSACTION

Unfortunately the trigger approach is not possible in SQL Server v7, because that RDBMS does not support BEFORE triggers, only AFTER triggers! Aware of the problem, the vendor nevertheless provides a fairly basic mechanism for generating incremented keys automatically, declared when the table is created with the keyword IDENTITY [(valeur_initiale, increment)].
The conditions are as follows:

- by default, the initial value and the increment are both 1;
- only one column per table can carry this constraint;
- the counter can be switched off and back on through the IDENTITY_INSERT setting (ON or OFF);
- the key value just inserted in the table can be read from the variable @@IDENTITY.

Example for SQL Server v7:

    -- Creating a table with an auto-incremented column:
    CREATE TABLE maTable
    (LaClef     INTEGER IDENTITY (6852, 1) NOT NULL PRIMARY KEY,
     UneColonne VARCHAR(32))

    -- Retrieving the inserted value:
    INSERT INTO maTable (UneColonne) VALUES ('example')
    SELECT @@IDENTITY

    ----------------------------------------
    6852

    -- Switching off the auto increment for manual insertion:
    SET IDENTITY_INSERT maTable ON
    INSERT INTO maTable (LaClef, UneColonne) VALUES (623, 'example 2')
    INSERT INTO maTable (LaClef, UneColonne) VALUES (998877, 'example 3')
    SET IDENTITY_INSERT maTable OFF

    -- Verification:
    INSERT INTO maTable (UneColonne) VALUES ('example 4')
    SELECT * FROM maTable

    LaClef      UneColonne
    ----------- --------------------------------
    623         example 2
    6852        example
    998877      example 3
    998878      example 4

5. CONCLUSION

The vendor-specific mechanisms do turn out to be more efficient than the key table technique, but the measured performance gain is fairly small. On the other hand, the key table method is portable across all RDBMS without major changes. In addition, the method can be refined, for example by storing the starting value and the increment, to meet every requirement.

We will therefore choose the key table method if the RDBMS is likely to change, and the RDBMS-specific methods if performance is sought at all costs.
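The refinement mentioned in the conclusion, storing a starting value and an increment per table, could look roughly like the sketch below. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for the real RDBMS; the Increment column and the register_table and allocate_key helpers are assumptions made for the illustration, not part of the original key table.

```python
import sqlite3

# In-memory SQLite database standing in for the real RDBMS (an assumption).
conn = sqlite3.connect(":memory:", isolation_level=None)
# Key table from section 3.2, extended with a hypothetical Increment column.
conn.execute(
    "CREATE TABLE LesClefs ("
    " NomTable CHAR(128) NOT NULL PRIMARY KEY,"
    " ValeurClef INTEGER NOT NULL DEFAULT 0,"
    " Increment INTEGER NOT NULL DEFAULT 1)"
)


def register_table(conn, table_name, start=1, increment=1):
    """Hypothetical helper: the first key handed out will be exactly `start`."""
    # Store start - increment so that the first allocation lands on `start`.
    conn.execute(
        "INSERT INTO LesClefs (NomTable, ValeurClef, Increment) VALUES (?, ?, ?)",
        (table_name, start - increment, increment),
    )


def allocate_key(conn, table_name):
    """Hypothetical helper: advance the counter by its own increment."""
    conn.execute("BEGIN IMMEDIATE")  # SQLite-specific: take the write lock now
    try:
        conn.execute(
            "UPDATE LesClefs SET ValeurClef = ValeurClef + Increment"
            " WHERE NomTable = ?",
            (table_name,),
        )
        row = conn.execute(
            "SELECT ValeurClef FROM LesClefs WHERE NomTable = ?", (table_name,)
        ).fetchone()
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    if row is None:
        raise KeyError(f"no counter registered for {table_name}")
    return row[0]


register_table(conn, "MaTable", start=6852, increment=1)
register_table(conn, "UneAutreTable", start=100, increment=10)

keys = [
    allocate_key(conn, "MaTable"),
    allocate_key(conn, "MaTable"),
    allocate_key(conn, "UneAutreTable"),
    allocate_key(conn, "UneAutreTable"),
]
print(keys)  # prints: [6852, 6853, 100, 110]
```

With these two extra pieces of information the key table covers the same ground as IDENTITY (valeur_initiale, increment) while staying independent of any one RDBMS.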