PostgreSQL: Generate Sequence

暗喜 2021-01-28 16:10

The query below generates a line of DNA sequence:

prepare dna_length(int) as
with t1 as (select chr(65) as s union select chr(67) union select chr(71) union select         


        
2 Answers
  • 2021-01-28 16:43

    Something like this?

    select x, string_agg((array['A', 'C', 'G', 'T'])[1 + floor(random() * 4)], '')
    from generate_series(1, 20, 1) gsn(n) cross join
         generate_series(1, 10, 1) gsx(x)
    group by x
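
    To see why this yields ten strings of twenty letters each: the cross join produces 20 × 10 rows, and `group by x` collapses each group of 20 rows into one aggregated string. A scaled-down sketch (3 positions and 2 sequences, numbers chosen purely for illustration) makes that row structure visible by counting rows instead of concatenating letters:

    ```sql
    -- Illustrative only: 3 positions per sequence, 2 sequences.
    -- Each value of x should end up with exactly 3 rows, i.e. 3 letters
    -- once string_agg is applied as in the query above.
    select x        as sequence_no,
           count(*) as letters_per_sequence
    from generate_series(1, 3) gsn(n) cross join
         generate_series(1, 2) gsx(x)
    group by x
    order by x;
    ```

    The first series therefore controls the length of each sequence and the second controls how many sequences come back.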
    
  • 2021-01-28 16:54

    Having worked with DNA content databases for quite some time, and with the scientists who like to play with them (the sequences, that is), I recommend a slight extension to the query by @GordonLinoff. Hide your implementation behind a function that takes two parameters: the length of the sequence and the number of sequences desired. You can then get any (reasonable) sequence length and any (reasonable) number of sequences.

    create or replace 
    function dna_sequence(
             sequence_length     integer
            ,number_of_sequences integer default 1 )
     returns table (dna_strand text) 
     language sql
    as $$ 
       select string_agg((array['A', 'C', 'G', 'T'])[1 + trunc(random() * 4)], '') 
         from generate_series(1, sequence_length , 1) gsn(n)  
        cross join generate_series(1, number_of_sequences, 1) gsx(x)
        group by x    
    $$;
    
    -- test
    select *
      from dna_sequence(20,10) ;
    
    select *
      from dna_sequence(250) ;
     
    select *
      from dna_sequence(20,100) 
     where position ('AAA' in dna_strand) > 0; 
    

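    NOTE: The query in the function above is copied almost verbatim from Gordon Linoff's original. The second argument of each generate_series call is replaced with the corresponding function parameter; the function also uses trunc where the original uses floor, which gives the same result for the non-negative values produced here.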
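
    One way to sanity-check the function is to aggregate over its output and confirm the row count, the strand length, and the alphabet. This is only a sketch: it assumes the dna_sequence function defined above has already been created, and the column aliases are made up for the example.

    ```sql
    -- Assumes dna_sequence(sequence_length, number_of_sequences) from above exists.
    -- For dna_sequence(20, 10) this should report 10 strands, each 20
    -- characters long, containing nothing but A, C, G and T.
    select count(*)                           as strands,
           min(length(dna_strand))            as min_len,
           max(length(dna_strand))            as max_len,
           bool_and(dna_strand ~ '^[ACGT]+$') as only_acgt
      from dna_sequence(20, 10);
    ```

    Because the letters come from random(), the strands themselves differ on every call; only the counts and the alphabet check are stable.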
