Any Python OLAP/MDX ORM engines?

I'm new to MDX/OLAP and I'm wondering whether there is an ORM for Python, similar to the Django ORM, that supports OLAP.

I'm a Python/Django developer and if the

4 Answers

夕颜

2020-12-13 23:38

I had a similar need - not for a full blown ORM but for a simple OLAP-like data store in Python. After coming up dry searching for existing tools I wrote this little hack:

https://github.com/kpwebb/python-cube/blob/master/src/cube.py

Even if it doesn't solve your exact need, it might be a good starting place for writing something more sophisticated.

囚心锁ツ

2020-12-13 23:45
Django has some OLAP features that are nearing release.

Read http://www.eflorenzano.com/blog/post/secrets-django-orm/

Also see http://doughellmann.com/2007/12/30/using-raw-sql-in-django.html

If you have a proper star schema design in the first place, then one-dimensional results can have the following form.
```
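from collections import defaultdict

from myapp.models import SomeFact

# `this` and `that` are placeholders for the dimension values to filter on.
facts = SomeFact.objects.filter(dimension1__attribute=this, dimension2__attribute=that)
myAggregates = defaultdict(int)
for row in facts:
    # Double-underscore lookups only work inside filter(); on a row, follow
    # the foreign key with ordinary attribute access.
    myAggregates[row.dimension3.attribute] += row.someMeasure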
```
If you want to create a two-dimensional summary, you have to do something like the following.
```
facts = SomeFact.objects.filter(dimension1__attribute=this, dimension2__attribute=that)
myAggregates = defaultdict(int)
for row in facts:
    # Key each total on a tuple of both dimension values.
    key = (row.dimension3.attribute, row.dimension4.attribute)
    myAggregates[key] += row.someMeasure
```
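The same pattern can be tried without a database. Below is a minimal sketch, assuming the fact rows are plain named tuples; the `Fact` type and its `region`, `year` and `amount` fields are made up for illustration and are not part of the models above. It also shows that a two-dimensional summary can be rolled up into a one-dimensional one without going back to the rows.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for fact rows; field names are illustrative only.
Fact = namedtuple("Fact", ["region", "year", "amount"])

rows = [
    Fact("north", 2019, 10),
    Fact("north", 2020, 15),
    Fact("south", 2019, 7),
    Fact("north", 2019, 5),
]

# Two-dimensional summary keyed by (region, year).
by_region_year = defaultdict(int)
for row in rows:
    by_region_year[(row.region, row.year)] += row.amount

# Roll the year dimension up to get a one-dimensional summary per region.
by_region = defaultdict(int)
for (region, _year), total in by_region_year.items():
    by_region[region] += total
```

Rolling up from the already-aggregated dictionary touches one entry per (region, year) pair rather than one per fact row.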
To compute multiple SUMs, COUNTs and so on, you have to do something like this.
```
class MyAgg(object):
    """Running count and sums for one group."""
    def __init__(self):
        self.count = 0
        self.thisSum = 0
        self.thatSum = 0

myAggregates = defaultdict(MyAgg)
for row in facts:
    # Follow the foreign key with attribute access; look the group up once per row.
    agg = myAggregates[row.dimension3.attr]
    agg.count += 1
    agg.thisSum += row.this
    agg.thatSum += row.that
```
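As a runnable variation on the accumulator class, here is a small sketch that uses a dataclass and derives a mean from the running count and sum. The `Agg` class and the `samples` list are hypothetical and stand in for the fact rows.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Agg:
    """Running count and sum for one group (illustrative only)."""
    count: int = 0
    total: float = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    @property
    def mean(self):
        # Derived measure; guard against an empty group.
        return self.total / self.count if self.count else 0.0


# (dimension value, measure) pairs standing in for fact rows.
samples = [("a", 2.0), ("b", 4.0), ("a", 6.0)]

aggregates = defaultdict(Agg)
for key, value in samples:
    aggregates[key].add(value)
```

Derived measures such as the mean are computed on demand from the stored count and sum, so each row only updates two fields.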
This -- at first blush -- seems inefficient. You're trolling through the fact table returning lots of rows which you are then aggregating in your application.

In some cases, this may be faster than the RDBMS's native sum/group_by. Why? You're using a simple mapping, not the more complex sort-based grouping operation that the RDBMS often has to use for this. Yes, you're getting a lot of rows; but you're doing less to get them.
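The difference between the two grouping strategies can be sketched in plain Python. This is only an illustration of the idea, not a model of any particular database; whether a given RDBMS sorts or hashes depends on its query planner. The function names are made up for this example.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

pairs = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]


def hash_group_sum(pairs):
    """Mapping-based grouping: one pass, no ordering required."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def sort_group_sum(pairs):
    """Sort-based grouping: sort by key first, then sum each run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }
```

Both functions return the same totals. The first does a constant amount of work per row on average, while the second pays for an O(n log n) sort before it can group.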

This has the disadvantage that it's not so declarative as we'd like. It has the advantage that it's pure Django ORM.
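For contrast, the declarative alternative pushes the grouping into the database. Here is a minimal sketch using the standard-library `sqlite3` module as a stand-in RDBMS; the `some_fact` table and its `dim3` and `measure` columns are assumptions made for this example, not the schema behind `SomeFact`. In Django versions that ship the aggregation API, `values()` combined with `annotate()` should express the same query through the ORM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_fact (dim3 TEXT, measure INTEGER)")
conn.executemany(
    "INSERT INTO some_fact (dim3, measure) VALUES (?, ?)",
    [("a", 2), ("b", 4), ("a", 6)],
)

# The database groups and sums; only one row per group comes back.
db_totals = dict(
    conn.execute("SELECT dim3, SUM(measure) FROM some_fact GROUP BY dim3")
)
conn.close()
```

Here the application receives one row per group instead of one per fact, at the cost of leaving the ORM for raw SQL.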
庸人自扰

2020-12-13 23:52

Same as kpw: I wrote my own tool, except that it is exclusively for Django:

https://code.google.com/p/django-cube/

花落未央

2020-12-13 23:52

There is also http://cubes.databrewery.org/, a lightweight OLAP engine in Python.
