问题
When i run a select after a number of joins on my table I have an output of 2 columns and I want to select a distinct combination of col1 and col2 for the rowset returned.
the query that i run will be smthing like this:
select a.Col1,b.Col2 from a inner join b on b.Col4=a.Col3
now the output will be somewhat like this
Col1 Col2
1 z
2 z
2 x
2 y
3 x
3 x
3 y
4 a
4 b
5 b
5 b
6 c
6 c
6 d
now I want the output should be something like follows
1 z
2 y
3 x
4 a
5 b
6 d
its ok if I pick the second column randomly as my query output is like a million rows and I really dnt think there will be a case where I will get Col1 and Col2 output to be same even if that is the case I can edit the value..
Can you please help me with the same.. I think basically the col3 needs to be a row number i guess and then i need to selct two cols bases on a random row number.. I dont know how do i transalte this to SQL
consider the case 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e now group by will give me all these results where as i want 1a and 2d or 1a and 2b. any such combination.
OK let me explain what im expecting:
with rs as(
select a.Col1,b.Col2,rownumber() as rowNumber from a inner join b on b.Col4=a.Col3)
select rs.Col1,rs.Col2 from rs where rs.rowNumber=Round( Rand() *100)
now I am not sure how do i get the rownumber or the random working correctly!!
Thanks in advance.
回答1:
If you simply don't care what col2
value is returned
select a.Col1,MAX(b.Col2) AS Col2
from a inner join b on b.Col4=a.Col3
GROUP BY a.Col1
If you do want a random value you could use the approach below.
;WITH T
AS (SELECT a.Col1,
b.Col2
ROW_NUMBER() OVER (PARTITION BY a.Col1 ORDER BY (SELECT NEWID())
) AS RN
FROM a
INNER JOIN b
ON b.Col4 = a.Col3)
SELECT Col1,
Col2
FROM T
WHERE RN = 1
Or alternatively use a CLR Aggregate function. This approach has the advantage that it eliminates the requirement to sort by partition, newid()
an example implementation is below.
using System;
using System.Data.SqlTypes;
using System.IO;
using System.Security.Cryptography;
using Microsoft.SqlServer.Server;
[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public struct Random : IBinarySerialize
{
private MaxSoFar _maxSoFar;
public void Init()
{
}
public void Accumulate(SqlString value)
{
int rnd = GetRandom();
if (!_maxSoFar.Initialised || (rnd > _maxSoFar.Rand))
_maxSoFar = new MaxSoFar(value, rnd) {Rand = rnd, Value = value};
}
public void Merge(Random group)
{
if (_maxSoFar.Rand > group._maxSoFar.Rand)
{
_maxSoFar = group._maxSoFar;
}
}
private static int GetRandom()
{
var buffer = new byte[4];
new RNGCryptoServiceProvider().GetBytes(buffer);
return BitConverter.ToInt32(buffer, 0);
}
public SqlString Terminate()
{
return _maxSoFar.Value;
}
#region Nested type: MaxSoFar
private struct MaxSoFar
{
private SqlString _value;
public MaxSoFar(SqlString value, int rand) : this()
{
Value = value;
Rand = rand;
Initialised = true;
}
public SqlString Value
{
get { return _value; }
set
{
_value = value;
IsNull = value.IsNull;
}
}
public int Rand { get; set; }
public bool Initialised { get; set; }
public bool IsNull { get; set; }
}
#endregion
#region IBinarySerialize Members
public void Read(BinaryReader r)
{
_maxSoFar.Rand = r.ReadInt32();
_maxSoFar.Initialised = r.ReadBoolean();
_maxSoFar.IsNull = r.ReadBoolean();
if (_maxSoFar.Initialised && !_maxSoFar.IsNull)
_maxSoFar.Value = r.ReadString();
}
public void Write(BinaryWriter w)
{
w.Write(_maxSoFar.Rand);
w.Write(_maxSoFar.Initialised);
w.Write(_maxSoFar.IsNull);
if (!_maxSoFar.IsNull)
w.Write(_maxSoFar.Value.Value);
}
#endregion
}
回答2:
You need to group by a.Col1
to get distinct by only a.Col1
, then since b.Col2
is not included in the group you need to find a suitable aggregate function to reduce all values in the group to just one, MIN is good enough if you just want one of the values.
select a.Col1, MIN(b.Col2) as c2
from a
inner join b on b.Col4=a.Col3
group by a.Col1
回答3:
If I understand you correctly, you want to have one line for each combination in column 1 and 2. That can easily be done by using GROUP BY or DISTINCT for instance:
SELECT col1, col2
FROM Your Join
GROUP BY col1, col2
回答4:
You must use a group by
clause :
select a.Col1,b.Col2
from a
inner join b on b.Col4=a.Col3
group by a.Col1
来源:https://stackoverflow.com/questions/5210631/selecting-a-distinct-combination-of-2-columns-in-sql