Sous Postgresql 8.2, PGadmin sous WindowsXP et Serveur sous Freebsd6.2.
J'ai deux tables A et B. A contient des ID en clef primaire. B contient des lignes d'info sur chaque ID de A (clef étrangère).
Mon but est de récupérer toutes les lignes de B associés aux ID de A dont la liste est fournit. La premiere chose qui m'ai venu à l'esprit est d'utiliser IN:
On voit qu'il ne passe pas par les index.
Code : Sélectionner tout - Visualiser dans une fenêtre à part
1
2
3
4
5
6
7
8
9
10
11 explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id IN(110,106,109,105,108,104,107,111,112,113,114,115,116,117,118,119,120); => "Hash Join (cost=60.65..1342640.85 rows=2105472 width=127) (actual time=162111.979..182338.830 rows=4157352 loops=1)" " Hash Cond: (B.id = A.id)" " -> Seq Scan on B (cost=0.00..1106953.08 rows=57219308 width=37) (actual time=2.572..94668.692 rows=57219307 loops=1)" " -> Hash (cost=60.44..60.44 rows=17 width=90) (actual time=25.505..25.505 rows=17 loops=1)" " -> Seq Scan on A (cost=0.00..60.44 rows=17 width=90) (actual time=3.624..25.471 rows=17 loops=1)" " Filter: (id = ANY ('{110,106,109,105,108,104,107,111,112,113,114,115,116,117,118,119,120}'::bigint[]))" "Total runtime: 188175.473 ms"
Etant sidéré par le résultat, j'ai tenté le OR (bin oui quoi, on ne sait jamais )
Bon, ce test est normal, rien à dire sauf qu'il ne passe pas non plus par les index.
Code : Sélectionner tout - Visualiser dans une fenêtre à part
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15 explain analyze SELECT * FROM B, A WHERE A.id=B.id AND (A.id =110 OR A.id=110 OR A.id=106 OR A.id=109 OR A.id=105 OR A.id=108 OR A.id=104 OR A.id=107 OR A.id=111 OR A.id=112 OR A.id=113 OR A.id=114 OR A.id=115 OR A.id=116 OR A.id=117 OR A.id=118 OR A.id=119 OR A.id=120); => "Hash Join (cost=71.63..1343890.36 rows=2229324 width=127) (actual time=184682.044..208278.921 rows=4157352 loops=1)" " Hash Cond: (B.id = A.id)" " -> Seq Scan on B (cost=0.00..1106953.08 rows=57219308 width=37) (actual time=3.083..108440.509 rows=57219307 loops=1)" " -> Hash (cost=71.41..71.41 rows=18 width=90) (actual time=25.823..25.823 rows=17 loops=1)" " -> Seq Scan on A (cost=0.00..71.41 rows=18 width=90) (actual time=4.866..25.783 rows=17 loops=1)" " Filter: ((id = 110) OR (id = 110) OR (id = 106) OR (id = 109) OR (id = 105) OR (id = 108) OR (id = 104) OR (id = 107) OR (id = 111) OR (id = 112) OR (id = 113) OR (id = 114) OR (id = 115) OR (id = 116) OR (id = 117) OR (id = 118) OR (id = 119) OR (id = 120))" "Total runtime: 215223.763 ms"
Le bourrin que je suis refait le même test avec des reqêtes SQL unique :
Gosh! On voit que maintenant, les index sont utilisés et le temps du résultat est vraiment autre chose!
Code : Sélectionner tout - Visualiser dans une fenêtre à part
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48 explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =110; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =106; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =109; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =105; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =108; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =104; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =111; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =112; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =113; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =114; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =115; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =116; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =117; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =118; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =119; explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =120; => "Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=10.054..107.973 rows=22479 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.016..0.021 rows=1 loops=1)" " Index Cond: (id = 120)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=10.028..44.841 rows=22479 loops=1)" " Recheck Cond: (120 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=6.746..6.746 rows=22479 loops=1)" " Index Cond: (120 = id)" "Total runtime: 139.458 ms" Bon, comme ce n'est que pour la derniere requête, voici le temps que m'a mis l'envoi de tous: => Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Le résultat de la requête contient 8 ligne(s) non-extraite(s). Durée totale d'exécution de la requête :31578 ms. 8 lignes récupérées.
Essayons maintenant avec une jointure:
Ici, les index sont utilisés. Mais ça ne me prends que la moitier du temps!
Code : Sélectionner tout - Visualiser dans une fenêtre à part
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136 explain analyze SELECT * FROM B, A WHERE A.id=B.id AND A.id =110 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =106 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =109 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =105 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =108 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =104 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =111 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =112 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =113 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =114 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =115 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =116 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =117 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =118 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =119 UNION SELECT * FROM B, A WHERE A.id=B.id AND A.id =120; => "Unique (cost=12118253.70..12498149.10 rows=10130544 width=127) (actual time=53045.073..76026.855 rows=3792894 loops=1)" " -> Sort (cost=12118253.70..12143580.06 rows=10130544 width=127) (actual time=53045.068..61569.970 rows=3792894 loops=1)" " Sort Key: id, trade_date, price, size, trade_type, id, full_name, name, bloomberg_id, exchange, currency, "type", country, last_intraday_trading_date" " -> Append (cost=14668.05..9554412.05 rows=10130544 width=127) (actual time=209.719..29140.871 rows=3792894 loops=1)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=209.715..1854.533 rows=381676 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=11.881..11.886 rows=1 loops=1)" " Index Cond: (id = 110)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=197.821..765.388 rows=381676 loops=1)" " Recheck Cond: (110 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=196.920..196.920 rows=381676 loops=1)" " Index Cond: (110 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=136.273..1306.497 rows=269992 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.019..0.024 rows=1 loops=1)" " Index Cond: (id = 106)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=136.234..544.500 rows=269992 loops=1)" " Recheck Cond: (106 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=135.629..135.629 rows=269992 loops=1)" " Index Cond: (106 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=155.711..1587.212 rows=332342 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.019..0.024 rows=1 loops=1)" " Index Cond: (id = 109)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=155.674..648.843 rows=332342 loops=1)" " Recheck Cond: (109 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=154.922..154.922 rows=332342 loops=1)" " Index Cond: (109 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=139.376..1352.241 rows=271366 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.019..0.023 rows=1 loops=1)" " Index Cond: (id = 105)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=139.338..585.931 rows=271366 loops=1)" " Recheck Cond: (105 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=132.109..132.109 rows=271366 loops=1)" " Index Cond: (105 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=150.368..1778.474 rows=365374 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.026 rows=1 loops=1)" " Index Cond: (id = 108)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=150.331..746.200 rows=365374 loops=1)" " Recheck Cond: (108 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=147.706..147.706 rows=365374 loops=1)" " Index Cond: (108 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=207.864..1454.164 rows=279586 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=6.012..6.017 rows=1 loops=1)" " Index Cond: (id = 104)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=201.835..658.534 rows=279586 loops=1)" " Recheck Cond: (104 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=199.319..199.319 rows=279586 loops=1)" " Index Cond: (104 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=102.347..1512.403 rows=317148 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.025 rows=1 loops=1)" " Index Cond: (id = 111)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=102.309..616.135 rows=317148 loops=1)" " Recheck Cond: (111 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=101.583..101.583 rows=317148 loops=1)" " Index Cond: (111 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=7.738..98.341 rows=20298 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.025 rows=1 loops=1)" " Index Cond: (id = 112)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=7.702..40.674 rows=20298 loops=1)" " Recheck Cond: (112 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=7.654..7.654 rows=20298 loops=1)" " Index Cond: (112 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=224.180..2502.560 rows=506542 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.018..0.023 rows=1 loops=1)" " Index Cond: (id = 113)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=224.144..1059.475 rows=506542 loops=1)" " Recheck Cond: (113 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=222.966..222.966 rows=506542 loops=1)" " Index Cond: (113 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=13.373..135.232 rows=27301 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.025 rows=1 loops=1)" " Index Cond: (id = 114)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=13.338..57.950 rows=27301 loops=1)" " Recheck Cond: (114 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=13.270..13.270 rows=27301 loops=1)" " Index Cond: (114 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=17.901..115.171 rows=21835 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.019..0.024 rows=1 loops=1)" " Index Cond: (id = 115)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=17.865..53.596 rows=21835 loops=1)" " Recheck Cond: (115 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=17.820..17.820 rows=21835 loops=1)" " Index Cond: (115 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=104.532..1225.125 rows=249396 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.019..0.024 rows=1 loops=1)" " Index Cond: (id = 116)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=104.496..521.122 rows=249396 loops=1)" " Recheck Cond: (116 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=103.916..103.916 rows=249396 loops=1)" " Index Cond: (116 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=170.937..2201.755 rows=462972 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.025 rows=1 loops=1)" " Index Cond: (id = 117)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=170.899..895.831 rows=462972 loops=1)" " Recheck Cond: (117 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=169.822..169.822 rows=462972 loops=1)" " Index Cond: (117 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=11.872..114.330 rows=23342 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.020..0.025 rows=1 loops=1)" " Index Cond: (id = 118)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=11.837..48.134 rows=23342 loops=1)" " Recheck Cond: (118 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=11.786..11.786 rows=23342 loops=1)" " Index Cond: (118 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=93.905..1152.346 rows=241245 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=7.603..7.609 rows=1 loops=1)" " Index Cond: (id = 119)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=86.286..463.921 rows=241245 loops=1)" " Recheck Cond: (119 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=85.766..85.766 rows=241245 loops=1)" " Index Cond: (119 = id)" " -> Nested Loop (cost=14668.05..590819.16 rows=633159 width=127) (actual time=11.806..110.855 rows=22479 loops=1)" " -> Index Scan using A_pkey on A (cost=0.00..8.27 rows=1 width=90) (actual time=0.022..0.028 rows=1 loops=1)" " Index Cond: (id = 120)" " -> Bitmap Heap Scan on B (cost=14668.05..584479.31 rows=633159 width=37) (actual time=11.766..47.290 rows=22479 loops=1)" " Recheck Cond: (120 = id)" " -> Bitmap Index Scan on id_trade_idx (cost=0.00..14509.76 rows=633159 width=0) (actual time=7.677..7.677 rows=22479 loops=1)" " Index Cond: (120 = id)" "Total runtime: 81346.953 ms"
Comment expliquez-vous tout ceci? Y a t'il des implémentations à faire pour pouvoir utiliser la notation IN?
Y a t'il des trucs que j'ai pas vu dans les explain?
Ou bien est-ce là une preuve que l'utilisation de IN n'est pas optimale?
Partager