问题
Similiar to a question asked here,
Given, this table, I want to only keep the records where the email appears first.
firstname | Lastname | Address | City | Zip | |
---|---|---|---|---|---|
ABC@XYZ.com | Scott | Johnson | A | Z | 1111 |
ABC@XYZ.com | Bill | Johnson | B | Y | 2222 |
ABC@XYZ.com | Ted | Smith | C | X | 3333 |
DEF@QRP.com | Steve | Williams | D | W | 4444 |
XYZ@LMN.com | Sam | Samford | E | U | 5555 |
XYZ@LMN.com | David | Beals | F | V | 6666 |
DEF@QRP.com | Stephen | Jackson | G | T | 7777 |
TUV@DEF.com | Seven | Alberts | H | S | 8888 |
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>email</th>
<th>firstname</th>
<th>Lastname</th>
<th>Address</th>
<th>City</th>
<th>Zip</th>
</tr>
</thead>
<tbody>
<tr>
<td>ABC@XYZ.com</td>
<td>Scott</td>
<td>Johnson</td>
<td>A</td>
<td>Z</td>
<td>1111</td>
</tr>
<tr>
<td>ABC@XYZ.com</td>
<td>Bill</td>
<td>Johnson</td>
<td>B</td>
<td>Y</td>
<td>2222</td>
</tr>
<tr>
<td>ABC@XYZ.com</td>
<td>Ted</td>
<td>Smith</td>
<td>C</td>
<td>X</td>
<td>3333</td>
</tr>
<tr>
<td>DEF@QRP.com</td>
<td>Steve</td>
<td>Williams</td>
<td>D</td>
<td>W</td>
<td>4444</td>
</tr>
<tr>
<td>XYZ@LMN.com</td>
<td>Sam</td>
<td>Samford</td>
<td>E</td>
<td>U</td>
<td>5555</td>
</tr>
<tr>
<td>XYZ@LMN.com</td>
<td>David</td>
<td>Beals</td>
<td>F</td>
<td>V</td>
<td>6666</td>
</tr>
<tr>
<td>DEF@QRP.com</td>
<td>Stephen</td>
<td>Jackson</td>
<td>G</td>
<td>T</td>
<td>7777</td>
</tr>
<tr>
<td>TUV@DEF.com</td>
<td>Seven</td>
<td>Alberts</td>
<td>H</td>
<td>S</td>
<td>8888</td>
</tr>
</tbody>
</table>
Expected output table:
firstname | Lastname | Address | City | Zip | |
---|---|---|---|---|---|
ABC@XYZ.com | Scott | Johnson | A | Z | 1111 |
DEF@QRP.com | Steve | Williams | D | W | 4444 |
XYZ@LMN.com | Sam | Samford | E | U | 5555 |
TUV@DEF.com | Seven | Alberts | H | S | 8888 |
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>email</th>
<th>firstname</th>
<th>Lastname</th>
<th>Address</th>
<th>City</th>
<th>Zip</th>
</tr>
</thead>
<tbody>
<tr>
<td>ABC@XYZ.com</td>
<td>Scott</td>
<td>Johnson</td>
<td>A</td>
<td>Z</td>
<td>1111</td>
</tr>
<tr>
<td>DEF@QRP.com</td>
<td>Steve</td>
<td>Williams</td>
<td>D</td>
<td>W</td>
<td>4444</td>
</tr>
<tr>
<td>XYZ@LMN.com</td>
<td>Sam</td>
<td>Samford</td>
<td>E</td>
<td>U</td>
<td>5555</td>
</tr>
<tr>
<td>TUV@DEF.com</td>
<td>Seven</td>
<td>Alberts</td>
<td>H</td>
<td>S</td>
<td>8888</td>
</tr>
</tbody>
</table>
回答1:
There is no inherent ordering of a table in DAX, so in order to take the first row you need to add an index column or define an ordering on the table somehow.
For this answer, I'll assume that you've added an index column somehow (in the query editor or with a DAX calculated column).
You can create a filtered table as follows:
FilteredTable1 =
FILTER (
Table1,
Table1[Index]
= CALCULATE ( MIN ( Table1[Index] ), ALLEXCEPT ( Table1, Table1[email] ) )
)
For each row in Table1
, this checks if the index is minimal over all the rows with the same email.
回答2:
Assuming that we added an Index
column with non duplicate values, it's possible to reduce the number of context transitions to only one per Email by preparing an Indexes
table containing the indexes to be selected, and then apply this Indexes
table as a filter using TREATAS.
T Index Unique =
VAR Indexes =
SELECTCOLUMNS(
ALL( 'T Index'[Email] ),
"MinIndex", CALCULATE( MIN( 'T Index'[Index] ) )
)
RETURN
CALCULATETABLE( 'T Index', TREATAS( Indexes, 'T Index'[Index] ) )
If instead we have non-unique column across the different Emails but unique per each email, like a timestamp, we can prepare a filter table containing the email and the timestamp
For instance with a T Date
table like the following
The calculated table becomes
T Date Unique =
VAR EmailDate =
ADDCOLUMNS(
ALL( 'T Date'[Email] ),
"MinDate", CALCULATE( MIN( 'T Date'[Date] ) )
)
RETURN
CALCULATETABLE( 'T Date', TREATAS( EmailDate, 'T Date'[Email], 'T Date'[Date] ) )
来源:https://stackoverflow.com/questions/65397214/follow-up-dax-drop-duplicates