I have a request for some contract work from an organization that uses Excel as a database and wants to do some work on the Excel data via a real database. (Yeah, I know, ne
You may be interested in Excel 2007 Collaboration features (like editing an xls from the web).
The same API that's used by VBA is available through an external COM interface. There are quite a few books on the subject. I recommend the O'Reilly one by Steven Roman but your tastes may vary.
Excel is a 'COM Capable Application' and as such you can use COM to access and manipulate the data in an Excel document. You don't say what platform you are using - but if it's .NET then it's really very easy. See http://support.microsoft.com/kb/302084 for how to get started with C#.
If you're not using .NET, then any language that can interact with a COM component will work.
You don't specify a language, so if you are language agnostic, .NET gives you some very powerful classes for data handling.
To open a CSV file:
Imports System.Data
Imports System.Data.OleDb
Imports Excel = Microsoft.Office.Interop.Excel

' DataFolder, CSVFileName and Filter are assumed to be String variables set by the caller.
' With the Jet text driver, Data Source is the folder and each CSV file acts as a table.
Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & DataFolder & "\;Extended Properties='text;HDR=Yes'"
Dim conn As New OleDbConnection(ConnectionString)
conn.Open()
Dim CommandText As String = "select * from [" & CSVFileName & "]"
If Filter.Length > 0 Then
    CommandText &= " WHERE " & Filter
End If
' Fill a DataSet table named "Asset" with the query result.
Dim daAsset As New OleDbDataAdapter(CommandText, conn)
Dim dsAsset As New DataSet
daAsset.Fill(dsAsset, "Asset")
conn.Close()
Opening a sheet in a workbook is very similar: you specify the sheet name and fill a DataSet with the entire sheet. You can then go through the Tables().Rows() of the DataSet to reach each row and field, iterate over every row, and so on.
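For comparison outside .NET, the same select-then-iterate pattern can be sketched in Python using only the standard library. This is a minimal sketch under assumptions that are not part of the answer above: the sheet has already been saved as CSV with a header row, SQLite stands in for the Jet provider, and the function name, the `asset` table, and the sample columns are all hypothetical.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text, table="asset"):
    """Load CSV text (first row = headers) into an in-memory SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets us read fields by column name
    # Quote identifiers so headers containing spaces still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, placeholders), data)
    return conn


# Hypothetical sample data standing in for an exported sheet.
SAMPLE = "Name,Location,Cost\nLaptop,Office,1200\nPrinter,Lab,300\nDesk,Office,150\n"

conn = load_csv_into_sqlite(SAMPLE)
# Pass filter values as parameters instead of concatenating them into the SQL.
cursor = conn.execute('SELECT * FROM "asset" WHERE "Location" = ?', ("Office",))
office_names = [row["Name"] for row in cursor]
print(office_names)
```

Binding the filter value as a parameter is a deliberate difference from the string-concatenated WHERE clause: it avoids quoting problems if the filter text ever comes from user input.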
We read and manipulate Excel data via Apache POI. It does not decode Excel files completely (formula cells in particular are not fully supported), but our customers are quite happy with the results.
POI is a Java library, so if you are a pure Windows shop there may be other, more natural options. As I said, though, our experience with POI has been very good.
Additionally, I believe I have heard of Excel ODBC drivers - maybe this is what you want or need? (Sorry, I've never worked with them.)
Another approach would be to write an Excel function that talks to the database directly and returns the result as an array.
If you think this approach would work well, you could try XLLoop, which lets you write Excel functions in Java, Python, Ruby, Perl, R, Lisp, or Erlang.
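The function-returns-an-array idea might look roughly like this in Python, one of the languages listed above. This sketch covers only the database side: `query_as_array`, the `orders` table, and the in-memory SQLite database are hypothetical stand-ins, and how such a function gets registered with XLLoop is not shown here, so check its documentation for that part.

```python
import sqlite3


def query_as_array(conn, sql, params=()):
    """Run a query and return a 2-D list: a header row, then the data rows."""
    cursor = conn.execute(sql, params)
    header = [col[0] for col in cursor.description]
    return [header] + [list(row) for row in cursor.fetchall()]


# Hypothetical in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Acme", 250.0), (2, "Globex", 99.5), (3, "Acme", 40.0)],
)

result = query_as_array(
    conn, "SELECT id, total FROM orders WHERE customer = ? ORDER BY id", ("Acme",)
)
print(result)
```

A rectangular list of lists maps naturally onto a range of cells, which is why the header row is returned as part of the same array.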