What is canonical cover, closure and extraneous attribute?
Solution 1
I know I'm late, but maybe someone drops by somewhen.
I think you made some mistakes:
For the closures:
-
B+
should beABCDEF
rather thanBCDEF
because of the FDC → A
-
C+
should beAC
(the closure of an attribute always contains itself) -
G+
isG
, see reason of the second bullet
To calculate the canonical cover, follow this algorithm. You need to look at your list of functional dependencies:
- Left reduction: try to remove all attributes of the left side of the arrow that are not necessary to create the same closure.
To make the first example,
AB → C
, you can calculateAB
s closure, which would beABCDEF
. You then try to removeA
, ending up withB → C
. Now you calculate the closure ofB
only, which is stillABCDEF
-> you can remove A. At the end of this step, your FD should look like{B → C, B → E, C F → D, C → A, B → F, C E → F, C D → B, B → C, G → G}
. - Now you make the same thing for the right hand side. Note that you can remove all attributes here if you want, ending up with the empty set. As an example, look at
B → F
: the closure ofB
isABCDEF
. If you remove theF
from the functional dependency, ending up withB → ∅
, you still got the same closure forB
as before. Repeat that for the other FDs. You should end up with{B →∅, B → E, C F → D, C → A, B →∅, C E → F, C D → B, B → C, G →∅}
. - Remove all FDs of the form
X → ∅
. You end up with{B → E, C F → D, C → A, C E → F, C D → B, B → C}
. - Combine all FDs that have the same left-side of the arrow, which leads to a canonical cover of
{B → C E, C F → D, C → A, C E → F, C D → B}
.
For the superkeys: see this SO answer
Solution 2
Are the closures A+,B+,C+,D+,E+,F+ calculated this way?
What happened to "G"? Its absence here is significant. Do you know why?
If I'm not mistaken then BCDEFG is a super key (”the whole key”) in 1NF/2NF but is it minimal (3NF)?
Superkey (one word, no spaces) doesn't mean the whole key; it just means a key. The set of all attributes is a trivial superkey, so {ABCDEFG} is a trivial superkey.
Since B->C and C->A (a transitive dependency), you can reduce the trivial superkey to {BCDEFG}. Several more reductions are possible, so {BCDEFG} isn't a minimal superkey. {BCDEFG} is a reducible superkey.
One of the minimal superkeys is {BG}. (I could say, "{BG} is an irreducible superkey.") There are other minimal superkeys.
What else should be done to normalize this example to 1NF, 2NF and 3NF with the help of closures and canonical cover?
Just in case you have a common misunderstanding of this, it's not generally possible to normalize to 2NF and no higher, or to normalize to 3NF and no higher. Eliminating partial-key dependencies ("normalizing to 2NF") can leave all your relations in 5NF.
The next step is to determine all the candidate keys (all the irreducible superkeys).
Solution 3
My answer is derived from the Algorithm given in Database System Concepts by Korth.
Are the closures A+,B+,C+,D+,E+,F+ calculated this way? Steps to calculate the closure of (A,B,C,D,E,F) under F let say take for B
- result = {B}
- repeat
- for each functional dependency(e.g
B -> E
) inF
do - begin
- if
B
is subset of result - then
result(i.e B) = result(i.e B) U {E}
- end
- until(result stops changing)
In this way the closures of the following will be: A+ = A
B+ = ABCDEF
C+ = AC
D+ = D
E+ = E
F+ = F
How to check an attribute is extraneous: An attribute A is extraneous in a dependency alpha(AB) -> beta(C) if
1) A belongs to beta(which is not in the current case) then create new FD
F' = (F-{alpha -> beta}) U {alpha -> (beta - alpha)}
and check if alpha+ under F'(**not F**) includes A
, then A
is extraneous in beta
.
2) A belongs to alpha(which is correct), then create a new gamma{B} = alpha({AB}) - {A}
and check if gamma+(i.e B+)
under **F** i.e ABCDEF
includes all attributes in beta({C})
and which is true. So A is extraneous in AB->C
.
similarly check if C
is extraneous in AB->C
. So by above suggested algo
F' : AB -> NULL; B →E; CF →D; C →A; B →F; CE →F; CD →B; B →C
- Compute
AB+
underF'
i.eABCDEF
which includesC
. SoC
is extraneous inAB-> C
.
How to compute canonical cover?
Algo:
F' = F
- If there is any FD like
A->B and A->C
then replace byA->BC
(by union rule) Here the F' become :AB -> C; B-> CEF; C -> A; CD-> B; CE-> F; CF-> D
- Find the extraneous(left/right) in F' i.e
A is extraneous in AB->C
so removeA from AB->C
, so that it becomesB->C
and updateF'
. - Now check if the F' is changed as previous. If changed goto step 2 and repeat until find the F' no longer changes.
(Explaining further iterations as below:
itr2:
F' : B -> C; B-> CEF; C -> A; CD-> B; CE-> F; CF-> D
F' : B-> CEF; C -> A; CD-> B; CE-> F; CF-> D
Now check C in B-> CEF
, which is not extraneous
Check E , which is also not extraneous.
check F ,which is extraneous.
So new F' : B-> CE; C -> A; CD-> B; CE-> F; CF-> D
itr3:
F' : B-> CE; C -> A; CD-> B; CE-> F; CF-> D
after this there is no further extraneous attribute found.
So the canonical cover of F are :
B-> CE
C -> A
CD-> B
CE-> F
CF-> D
Let me know if have some error in logic suggested above.
Niklas Rosencrantz
I'm as simple as possible but not any simpler.
Updated on June 05, 2022Comments
-
Niklas Rosencrantz almost 2 years
I'm studying database concepts and there are 3 concepts that I don't understand: canonical cover, extraneous attribute and closure. I read the definition about canonical cover but I don't get the picture of how it relates to 3NF and BCNF. The definition of canonical cover appears to be that there are no extraneous attributes and extraneous attributes are attributes that don't change the closure of the set of functional dependencies and closure is the set of all functional dependencies implied by F, a set of functional dependencies.
But all this is a little fuzzy and I'd like to know both an intuitive definition and how to calculate
- Canonical cover
- Closure
- Extraneous attribute
Functional dependency I believe I understand--it's like what would have been the PK in a table if we had those attributes in a table.
There is a rather extensive answer at database refinement - minimal cover of F (extraneous attributes) but I found it difficult to read all the set definitions and algebra and I'd rather have definitions in plain English.
For example, having the schema U={A,B,C,D,E,F,G} and the functional dependencies
AB → C B → E CF → D C → A B → F CE → F CD → B B → C
Are the closures A+,B+,C+,D+,E+,F+ calculated this way?
A+ = A B+ = BCDEF C+ = A D+ = D E+ = E F+ = F
If I'm not mistaken then BCDEFG is a superkey (”the whole key”) in 1NF/2NF but is it minimal (3NF)?
What else should be done to normalize this example to 1NF, 2NF and 3NF with the help of closures and canonical covers? Is canonical cover the same as minimal cover?