How can I build a front end for querying a Redshift database (hopefully with Rails)

后端 未结 3 1435
梦谈多话
梦谈多话 2021-02-03 10:59

So I have a Redshift database with enough tables that it feels worth my time to build a front end to make querying it a little bit easier than just typing in SQL commands.

相关标签:
3条回答
  • 2021-02-03 11:24

    You might want to consider http://www.looker.com/. It's a frontend for exploring your DB, allowing easily savable queries and a GUI that the business guys can also use.

    0 讨论(0)
  • 2021-02-03 11:42

    This tutorial helps you setup a rails app with a redshift adapter:

    https://www.credible.com/code/setting-up-a-data-warehouse-with-aws-redshift-and-ruby/

    In a nutshell:

    Clone the sample app:

    git clone git@github.com:tuesy/redshift-ruby-tutorial.git
    cd redshift-ruby-tutorial
    

    Setup ENV Variables via ~/.bashrc (or dotenv):

    export REDSHIFT_HOST=redshift-ruby-tutorial.ccmj2nxbsay7.us-east-1.redshift.amazonaws.com
    export REDSHIFT_PORT=5439
    export REDSHIFT_USER=deploy
    export REDSHIFT_PASSWORD=<your password here>
    export REDSHIFT_DATABASE=analytics
    export REDSHIFT_BUCKET=redshift-ruby-tutorial
    

    Use the gem activerecord4-redshift-adapter, in Gemfile:

    'activerecord4-redshift-adapter', '~> 0.2.0' # For Rails 4.2
    'activerecord4-redshift-adapter', '~> 0.1.1' # For Rails 4.1
    

    Then you can query into redshift like you would with a normal AR model:

    bundle exec rails c
    RedshiftUser.count
    

    (Disclosure: I havn't yet given this method a try, but I may soon)

    0 讨论(0)
  • 2021-02-03 11:45
    #app/models/data_warehouse.rb
    class DataWarehouse < ActiveRecord::Base                      
      establish_connection "redshift_staging"
      #or, if you want to have a db per environment
      #establish_connection "redshift_#{Rails.env}"
    end
    

    Note that we are connecting on 5439, not the default 5432 so I specify the port Also, I specify a schema, beta, which is what we use for our unstable aggregates, you could either have a different db per environment as mentioned above, or use various schemas and include them in the search path for ActiveRecord

    #config/database.yml
    redshift_staging:                                                          
      adapter: postgresql                                                      
      encoding: utf8                                                           
      database: db03                                                         
      port: 5439                                                               
      pool: 5                                                                  
      schema_search_path: 'beta'                                                                                          
      username: admin                                                        
      password: supersecretpassword                                               
      host: db03.myremotehost.us  #your remote host here, might be an aws url from Redshift admin console 
    

    ###OPTION 2, a direct PG Connection

      class DataWarehouse < ActiveRecord::Base                      
    
        attr_accessor :conn                                                       
    
        def initialize                                                            
          @conn = PG.connect(                                                     
           database: 'db03',                                                   
           port: 5439,                                                           
           pool: 5,                                                              
           schema_search_path: 'beta',                                           
           username: 'admin',                                                  
           password: 'supersecretpassword',                                         
           host: 'db03.myremotehost.us'                                               
          )                                                                       
        end    
      end
    
    
    [DEV] main:0> redshift = DataWarehouse
    E, [2014-07-17T11:09:17.758957 #44535] ERROR -- : PG::InsufficientPrivilege: ERROR:  permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'
    (pry) output error: #<ActiveRecord::StatementInvalid: PG::InsufficientPrivilege: ERROR:  permission denied to set parameter "client_min_messages" to "notice" : SET client_min_messages TO 'notice'>   
    

    UPDATE:

    I ended up going with option 1, but using this adapter for now for multiple reasons:

    https://github.com/fiksu/activerecord-redshift-adapter

    Reason 1: ActiveRecord postgresql adapter sets client_min_messages Reason 2: adapter also attempts to set Time Zone, which redshift doesn't allow (http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html) Reason 3: Even if you change the code in ActiveRecord for the first two errors, you run into additional errors that complain that Redshift is using Postgresql 8.0, at that point I moved on to the adapter, will revisit and update if I find something better later.

    I renamed my table to base_aggregate_redshift_tests (notice plural) so ActiveRecord was easily able to connect, if you can't change your table names in redshift use the set_table method I have commented out below

    #Gemfile:
    gem 'activerecord4-redshift-adapter', github: 'aamine/activerecord4-redshift-adapter'
    

    Option 1

    #config/database.yml
    redshift_staging:                                                                                                             
      adapter: redshift                                                                                                           
      encoding: utf8                                                                                                              
      database: db03                                                                                                           
      port: 5439                                                                                                                  
      pool: 5                                                                                                                     
      username: admin                                                                                                
      password: supersecretpassword                                                                                                  
      host: db03.myremotehost.us                                                                                                       
      timeout: 5000   
    
    #app/models/base_aggregates_redshift_test.rb
    #Model named to match my tables in Redshift, if you want you can set_table like I have commented out below
    
    class BaseAggregatesRedshiftTest < ActiveRecord::Base
      establish_connection "redshift_staging"
      self.table_name = "beta.base_aggregates_v2"
    end
    

    in console using self.table_name -- notice it queries the right table, so you can name your models whatever you want

    [DEV] main:0> redshift = BaseAggregatesRedshiftTest.first                                                                    
    D, [2014-07-17T15:31:58.678103 #43776] DEBUG -- :   BaseAggregatesRedshiftTest Load (45.6ms)  SELECT "beta"."base_aggregates_v2".* FROM "beta"."base_aggregates_v2" LIMIT 1            
    

    Option 2

    #app/models/base_aggregates_redshift_test.rb
    class BaseAggregatesRedshiftTest < ActiveRecord::Base
      set_table "beta.base_aggregates_v2"
    
      ActiveRecord::Base.establish_connection(
        adapter: 'redshift',
        encoding: 'utf8',
        database: 'staging',
        port: '5439',
        pool: '5',
        username: 'admin',
        password: 'supersecretpassword',
        search_schema: 'beta',
        host: 'db03.myremotehost.us',
        timeout: '5000'
      )
    
    end
    
    #in console, abbreviated example of first record, now it's using the new name for my redshift table, just assuming I've got the record at base_aggregates_redshift_tests because I didn't set the table_name
    
    [DEV] main:0> redshift = BaseAggregatesRedshiftTest.first
    D, [2014-07-17T15:09:39.388918 #11537] DEBUG -- :   BaseAggregatesRedshiftTest Load (45.3ms)  SELECT "base_aggregates_redshift_tests".* FROM "base_aggregates_redshift_tests" LIMIT 1
    #<BaseAggregatesRedshiftTest:0x007fd8c4a12580> {
                                                    :truncated_month => Thu, 31 Jan 2013 19:00:00 EST -05:00,
                                                    :dma => "Cityville",
                                                    :group_id => 9712338,
                                                    :dma_id => 9999 
                                                    }
    

    Good luck @johncorser!

    0 讨论(0)
提交回复
热议问题