Question
I found the statement "The functions ets:select/2 and mnesia:select/3 should be preferred over ets:match/2, ets:match_object/2, and mnesia:match_object/3" in the Erlang Efficiency Guide: http://www.erlang.org/doc/efficiency_guide/tablesDatabases.html
I have also read some articles comparing select and match, and I concluded that several factors affect the result, such as the number of records in the table, whether the select/match is on the key, the table type (bag, set, ...), and so on.
In my test, I cover every table type with 100,000 records and with 10,000 records, and only select/match on a non-key field.
The code is as follows:
%% ets:fun2ms/1 needs -include_lib("stdlib/include/ms_transform.hrl"). in the module.
select_ets_test(Times) ->
    MS = ets:fun2ms(fun(T) when T#ets_haoxian_template.count == 15 -> T end),
    T1 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_bag, MS) end, Times]),
    T2 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_set, MS) end, Times]),
    T3 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_ordered_set, MS) end, Times]),
    T4 = timer:tc(?MODULE, todo, [fun() -> ets:select(haoxian_test_duplicate_bag, MS) end, Times]),
    io:format("select bag : ~p~n", [T1]),
    io:format("select set : ~p~n", [T2]),
    io:format("select ordered_set : ~p~n", [T3]),
    io:format("select duplicate bag : ~p~n", [T4]).

match_ets_test(Times) ->
    %% This is a match pattern, not a match specification.
    Pattern = #ets_haoxian_template{count = 15, _ = '_'},
    T1 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_bag, Pattern) end, Times]),
    T2 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_set, Pattern) end, Times]),
    T3 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_ordered_set, Pattern) end, Times]),
    T4 = timer:tc(?MODULE, todo, [fun() -> ets:match_object(haoxian_test_duplicate_bag, Pattern) end, Times]),
    io:format("match bag : ~p~n", [T1]),
    io:format("match set : ~p~n", [T2]),
    io:format("match ordered_set : ~p~n", [T3]),
    io:format("match duplicate bag : ~p~n", [T4]).

%% Call Fun the given number of times, discarding its result.
todo(_Fun, 0) ->
    ok;
todo(Fun, Times) ->
    Fun(),
    todo(Fun, Times - 1).
The record looks like #ets_haoxian_template{type = X, count = Y, ...}, and keypos is the type field.
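The question does not show how the four tables are created or filled, so the following is only a sketch of one plausible setup. The module name ets_setup_sketch, the two-field record, and the way count is distributed (I rem 100) are all assumptions made for illustration; only the table names, the table types, and the keypos choice come from the question.

```erlang
-module(ets_setup_sketch).
-export([setup/1]).

%% Assumption: only type and count are known from the question; the other
%% record fields are elided there, so none are invented here.
-record(ets_haoxian_template, {type, count}).

%% Create the four named tables and insert N records into each.
setup(N) ->
    Tables = [{haoxian_test_bag, bag},
              {haoxian_test_set, set},
              {haoxian_test_ordered_set, ordered_set},
              {haoxian_test_duplicate_bag, duplicate_bag}],
    %% Assumption: count cycles through 0..99, so about 1% of rows have count 15.
    Rows = [#ets_haoxian_template{type = I, count = I rem 100}
            || I <- lists:seq(1, N)],
    lists:foreach(
      fun({Name, Kind}) ->
              %% keypos points at the type field, as stated in the question.
              Name = ets:new(Name, [Kind, named_table, public,
                                    {keypos, #ets_haoxian_template.type}]),
              true = ets:insert(Name, Rows)
      end, Tables),
    ok.
```

With a setup like this, count is not the key, so both ets:select/2 and ets:match_object/2 in the benchmark have to scan the whole table.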
The results are as follows. Test with 10,000 records:
insert bag : {324000,true}
insert set : {221000,true}
insert ordered_set : {108000,true}
insert duplicate bag : {173000,true}
select bag : {284000,ok}
select set : {255000,ok}
select ordered_set : {221000,ok}
select duplicate bag : {252000,ok}
match bag : {238000,ok}
match set : {192000,ok}
match ordered_set : {136000,ok}
match duplicate bag : {191000,ok}
Test with 100,000 records:
insert bag : {1654000,true}
insert set : {1684000,true}
insert ordered_set : {981000,true}
insert duplicate bag : {1769000,true}
select bag : {3404000,ok}
select set : {3433000,ok}
select ordered_set : {2501000,ok}
select duplicate bag : {3678000,ok}
match bag : {2749000,ok}
match set : {2927000,ok}
match ordered_set : {1748000,ok}
match duplicate bag : {2923000,ok}
It seems that match is faster than select? Or is something wrong with my test?
Answer 1:
The match function employs a special tuple syntax (match_pattern) to decide what to return.
The select function employs a special tuple syntax (match_spec) that is a superset of match_pattern, with the ability to specify guards and extract elements from the result set (rather than just returning the pattern-variable bindings or the whole matching objects).
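To make the two syntaxes concrete, here is a minimal sketch on a small table of plain tuples. The module name ets_match_vs_select, the table contents, and the choice of 15 as the value to look for are illustrative assumptions, not taken from the question's data.

```erlang
-module(ets_match_vs_select).
-export([demo/0]).

demo() ->
    T = ets:new(demo, [set]),
    true = ets:insert(T, [{a, 15, x}, {b, 20, y}, {c, 15, z}]),

    %% match_pattern with ets:match/2: returns the bindings of '$1', '$2', ...
    Bindings = ets:match(T, {'$1', 15, '_'}),

    %% match_pattern with ets:match_object/2: returns the whole objects.
    Objects = ets:match_object(T, {'_', 15, '_'}),

    %% match_spec with ets:select/2: a list of {Head, Guards, Body} tuples.
    %% The guard does the filtering and the body returns only the key.
    Keys = ets:select(T, [{{'$1', '$2', '_'}, [{'==', '$2', 15}], ['$1']}]),

    %% A set table has no defined order, so sort before returning.
    {lists:sort(Bindings), lists:sort(Objects), lists:sort(Keys)}.
```

Calling ets_match_vs_select:demo() should return {[[a],[c]], [{a,15,x},{c,15,z}], [a,c]}: the same two rows, shaped three different ways.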
My understanding is that:
- select compiles the match_spec into an internal form before running it, expediting how fast it runs
- the ability to provide guards eliminates false positives quicker than is possible with just a match_pattern (since they will run first)
- the ability to extract elements from the result set in-place saves you work you would have to do later, rather than iterating over the returned objects to extract that data.
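The points about guards and in-place extraction can be sketched with a range condition, which a match_pattern cannot express. The module name ets_select_guard, the table contents, and the threshold 70 are assumptions chosen only for illustration.

```erlang
-module(ets_select_guard).
-export([demo/0]).

demo() ->
    T = ets:new(demo, [set]),
    true = ets:insert(T, [{I, I * 10, data} || I <- lists:seq(1, 10)]),

    %% With match_object, a "greater than" test has to happen afterwards:
    %% every object is returned to the caller and filtered in a comprehension.
    All = ets:match_object(T, {'_', '_', '_'}),
    ViaMatch = [K || {K, V, _} <- All, V > 70],

    %% With select, the guard filters inside ETS and the body returns
    %% only the key of each surviving object.
    ViaSelect = ets:select(T, [{{'$1', '$2', '_'}, [{'>', '$2', 70}], ['$1']}]),

    %% Both approaches find the same keys; compare them order-independently.
    lists:sort(ViaMatch) =:= lists:sort(ViaSelect).
```

The benchmark in the question uses a plain equality test and returns whole records, which is the case where select has nothing extra to offer over match_object.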
In trivial non-specific use-cases, select is just a lot of work around match. In non-trivial more common use-cases, select will give you what you really want a lot quicker.
Source: https://stackoverflow.com/questions/29505161/erlang-ets-select-and-match-performance