I have a somewhat messy query that I'm trying to figure out.
I have a column called "meta_value" and in that I have some HTML data such as:
-
Depending on how frequently you need to perform this and the size of the dataset, I would probably extract this data into a new table. I would create a table with pk, card_name (unique) and count, then write a command in the application that iterates over the existing data, parses the names out of the tag bodies or the data-name attribute in the HTML, and creates the row or updates the count in the row. Then I would change the application so that this table is updated whenever meta_value changes.
Doing it this way and simply sorting on the count field will be more performant for this specific lookup, and the query will remain valid should the structure of your HTML ever change. It also allows you to add other properties to these items.
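For illustration, the parsing step of such a command could look roughly like the Python sketch below. The sample markup, the `count_cards` helper and the choice to count each card once per deck are assumptions made for the example, not something taken from the question; the regex accepts both `data-name="X"` and the doubled-quote form `data-name=""X""`.

```python
import re
from collections import Counter

# Tolerates both data-name="X" and the doubled-quote form data-name=""X"".
CARD_NAME_RE = re.compile(r'data-name="+([^"]+)"+')


def count_cards(meta_values):
    """Return a Counter mapping card name -> number of decks containing it."""
    counts = Counter()
    for html in meta_values:
        # A set counts each card once per deck, however many copies it holds.
        counts.update(set(CARD_NAME_RE.findall(html)))
    return counts


# Hypothetical sample rows standing in for meta_value contents.
sample_rows = [
    '<a data-name="Trap Hole">Trap Hole</a><a data-name="Mystic Box">Mystic Box</a>',
    '<a data-name="Mystic Box">Mystic Box</a><a data-name="Mystic Box">Mystic Box</a>',
]

card_counts = count_cards(sample_rows)
for name, count in card_counts.most_common(10):
    print(name, count)
```

In a real command, the resulting counts would be written to the new table (insert the row or update its count) instead of being printed.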
-
The following is a MySQL-only solution. You can run this query once or twice a day during off-peak hours to update the counts in a cache/summary table. Since there are only about 6,000 rows, it should not cause performance issues (depending on your server configuration).
Now, since the number of cards in a particular row is variable (can range from 40-60), we can use a Sequence table. You can define a permanent table in your database storing integers ranging from 1 to 100 (you may find this table helpful in many other cases as well):
CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1), (2), ...... , (99), (100);
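If you would rather not type out all 100 values by hand, the statement can be generated. Here is a minimal Python sketch; `build_seq_insert` is a hypothetical helper name, and the output is meant to be pasted into your MySQL client.

```python
def build_seq_insert(max_n=100):
    """Build the INSERT statement that fills seq with 1..max_n."""
    values = ", ".join(f"({n})" for n in range(1, max_n + 1))
    return f"INSERT INTO seq (n) VALUES {values};"


insert_sql = build_seq_insert()
print(insert_sql)
```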
Now, we will do a JOIN between the wph3_postmeta and seq tables, based on the number of occurrences of the substring 'data-name=""' inside the specific meta_value. We can get that number (which is also the number of cards in a particular row) using:
(
CHAR_LENGTH(wp.meta_value)
- CHAR_LENGTH(REPLACE(wp.meta_value, 'data-name=""', ''))
) / CHAR_LENGTH('data-name=""')
Now, we can use the Substring_Index() function to extract the card names. Because the query passes a negative count (-seq.n), each seq row yields a different card: n = 1 extracts the last card in the string, n = 2 the second-to-last, and so on.
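To see how the counting expression and the nested Substring_Index() calls behave, here is a small Python sketch that mirrors the same logic on one string. The `substring_index` function is a hand-written emulation of the MySQL function (adequate for this delimiter, not a full reimplementation), and the sample meta_value is made up for the example.

```python
def substring_index(text, delim, count):
    """Emulate MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right of the count-th delimiter
    # when counting from the end of the string.
    return delim.join(parts[count:])


DELIM = 'data-name=""'

# Hypothetical meta_value holding two cards, using the doubled-quote markup.
meta_value = (
    '<a data-name=""Trap Hole"">Trap Hole</a>'
    '<a data-name=""Mystic Box"">Mystic Box</a>'
)

# Same idea as the CHAR_LENGTH / REPLACE expression: occurrences of DELIM.
card_count = (len(meta_value) - len(meta_value.replace(DELIM, ""))) // len(DELIM)

# One extraction per sequence number n, like one joined seq row per card.
names = [
    substring_index(substring_index(meta_value, DELIM, -n), '"">', 1)
    for n in range(1, card_count + 1)
]
print(card_count, names)
```

The inner call keeps everything after the n-th 'data-name=""' from the end, and the outer call cuts that at the first '"">', which leaves just the card name.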
Once all the names have been extracted into separate rows, we can use the complete result set as a Derived Table and run an aggregation query on it to get the required results:
Query (View on DB Fiddle)
SELECT dt.name,
       COUNT(DISTINCT dt.meta_id) AS unique_metaid_count
FROM (
    SELECT wp.meta_id,
           SUBSTRING_INDEX(
               SUBSTRING_INDEX(wp.meta_value, 'data-name=""', -seq.n),
               '"">', 1
           ) AS name
    FROM wph3_postmeta AS wp
    JOIN seq
      ON (CHAR_LENGTH(wp.meta_value)
          - CHAR_LENGTH(REPLACE(wp.meta_value, 'data-name=""', ''))
         ) / CHAR_LENGTH('data-name=""') >= seq.n
    WHERE wp.meta_key = 'deck_list'
) AS dt
GROUP BY dt.name
ORDER BY unique_metaid_count DESC
/* To get top 10 counts only, add LIMIT 10 */
Result
| name | unique_metaid_count |
| --------------------------------------------- | ------------------- |
| Call of the Haunted | 2 |
| Inferno Reckless Summon | 2 |
| Mystic Box | 2 |
| Mystical Space Typhoon | 2 |
| Number 39: Utopia | 2 |
| #created by ygopro2 | 1 |
| 98095162 | 1 |
| Abyss Dweller | 1 |
| Advanced Ritual Art | 1 |
| Armed Dragon LV3 | 1 |
| Armed Dragon LV5 | 1 |
| Axe of Despair | 1 |
| B.E.S. Covered Core | 1 |
.....
| The Dragon Dwelling in the Cave | 1 |
| The Flute of Summoning Dragon | 1 |
| The Forces of Darkness | 1 |
| Threatening Roar | 1 |
| Time Machine | 1 |
| Torike | 1 |
| Tornado Dragon | 1 |
| Torrential Tribute | 1 |
| Tragoedia | 1 |
| Trap Hole | 1 |
| Treeborn Frog | 1 |
| Trishula, Dragon of the Ice Barrier | 1 |
| Twin Twisters | 1 |
| Vanity's Ruler | 1 |
| Wind-Up Snail | 1 |
| Wind-Up Soldier | 1 |
| Wulf, Lightsworn Beast | 1 |
| Zure, Knight of Dark World | 1 |
Note: If you want the top 10 only (by count), you can simply add LIMIT 10 at the end of the query.
-
I guess you need a simple LIMIT and ORDER BY Clause -
-- COUNT(*) counts the rows in each group; COUNT(DISTINCT meta_value)
-- is always 1 when grouping by meta_value.
SELECT meta_value, COUNT(*) AS VALUE_COUNT
FROM wph3_postmeta
WHERE meta_key = 'deck_list'
AND meta_value REGEXP '>[[:alnum:]]+(</a>)$'
GROUP BY meta_value
ORDER BY VALUE_COUNT DESC
LIMIT 10;