Transferring data from Microsoft Office
You can transfer data from Microsoft Access tables, Microsoft Excel worksheets, and SQL Server tables using the ADO DSC, which itself uses non-UNICOM Intelligence Data Model OLE DB providers to read data sources. Using the ADO DSC to read a data source is always preferable to using a non-UNICOM Intelligence Data Model OLE DB provider directly because the ADO DSC presents the data using the standard VDATA virtual table. Therefore, you do not need to specify a select query in the InputDataSource section as it will default to SELECT * FROM vdata. In addition, using the ADO DSC allows you to use an input metadata source, which means that you can then create an output metadata document (.mdd) file when you run the transfer. If you are exporting to a Quantum file, using an input metadata source means that you will be able to allocate the card, column, and punch definitions in the OnAfterMetaDataTransformation event.
Not all non-UNICOM Intelligence Data Model providers can be used with the ADO DSC, but here are some that can be used:
▪Microsoft OLE DB Provider for ODBC Drivers. Can read Access tables and Excel worksheets.
▪Microsoft Jet 4.0 OLE DB Provider. Can also read Access tables and Excel worksheets. However, this provider always returns variables in alphabetical order, which may not be what you require.
For more information about the types of providers that the ADO DSC supports, see
ADO DSC. For information about using providers that are not supported by the ADO DSC, see
Reading data using other OLE DB providers.
Providing a metadata source
You can optionally use a .adoinfo file as the metadata source for your transfer. An .adoinfo file does not contain actual metadata, instead it contains an ADO connection string that identifies the data source and the name of the table to be read. Having a metadata source is recommended as it will allow you to create a metadata document (.mdd) file in the OutputDataSource section of your data management script and also allow you to define new variables in the Metadata section. In addition, if you are exporting to a Quantum file, having a metadata source means that you will be able to allocate the card, column, and punch definitions during the export.
It is not possible to use a metadata document (.mdd) file as a metadata source when using the ADO DSC.
How do I specify reading from an MS Office file?
1. Using ADO DSC to read the metadata
In the connection string in the
InputDataSource section, specify
mrADODsc for both the
Data Source and the
MR Init MDSC connection properties and specify the name and location of the .
adoinfo file for the
Initial Catalog connection property:
InputDatasource(myInputDataSource, "My input data source")
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrADODsc; _
Initial Catalog=<Name and location of .adoinfo file>; _
MR Init MDSC=mrADODsc"
End InputDatasource
For examples of .adoinfo files for Access, SQL Server tables, and Excel worksheets, see
Example .adoinfo files.
2. Reading case data only
In the connection string in the
InputDataSource section, specify
mrADODsc for the
Data Source connection property, specify the ADO connection string for the
Location connection property, and specify the name of the table to be read (or if Excel, the worksheet name) for the
MR Init Project connection property:
InputDatasource(myInputDataSource, "My input data source")
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrADODsc; _
Location=<ADO connection string>; _
MR Init Project=<table name>"
End InputDatasource
The value of the ADO connection string depends on both the provider being used and the type of data source. In this situation, you may find it easier to use the Connection String Builder in UNICOM Intelligence Professional to generate the value for ConnectionString. For more information, see
Using the Connection String Builder.
Reading an Excel worksheet
The data on the worksheet needs to be suitable. By default, the first row in the worksheet defines the variable names and the provider will examine the cell contents of the next few rows to determine the appropriate data type for each variable. The provider will change the variable names if they contain characters that are not supported. For example, a column called Respondent.Serial will become Respondent#Serial in the output, and a column called When Decided will become whenDecided.
Data types and sizes
The provider will assign a maximum length of 255 characters to any worksheet column that it identifies as a text variable. Any column that contains only numbers, whether integer or decimal, will be deemed to be a 64-bit floating point number with at least 15 digits of precision. However, if you have defined a metadata source, you can change the types and sizes of variables during the transfer by adding code to the
OnAfterMetaDataTransformation Event section of your data management script. For example:
Event(OnAfterMetaDataTransformation, "Change types and sizes of columns")
Dim MDM
Set MDM = CreateObject("MDM.Document")
MDM.Open(dmgrJob.TransformedOutputMetaData[0])
' Change decimal column "MyFirstColumn" to an integer
' with a valid range between 0 and 99...
MDM.Variables["MyFirstColumn"].DataType = DataTypeConstants.mtLong
MDM.Variables["MyFirstColumn"].Variable.RangeExpression = "[0..99]"
' Restrict the valid range of decimal column "MySecondColumn"...
MDM.Variables["MySecondColumn"].Variable.RangeExpression = "[-999.999..999.999]"
' Restrict the size of text column "MyThirdColumn" to
' between 0 and 25 characters...
MDM.Variables["MyThirdColumn"].Variable.RangeExpression = "[0..25]"
MDM.Save()
MDM.Close()
End Event
Worksheet names
Depending on the version of Excel installed on your computer, you may have to add a dollar sign ($) after the worksheet name to be able read the worksheet successfully. The worksheet name is specified in the .adoinfo file or, if the transfer is case data only, by the MR Init Project connection string property. For example, to read the worksheet called Sheet1, you may have to specify Sheet1$. You may also be able to specify a range of cells in the worksheet. For example, specifying the worksheet name as Sheet1$A1:C11 will read the range A1:C11. If you specify a range, make sure that the first row contains the variable names.
Examples
1. Microsoft Access
This example transfers to a SAV file all of the data in the Authors sample Access database. The example uses the UNICOM Intelligence ADO DSC and the Microsoft OLE DB Provider for ODBC Drivers to read the Access database. This example does not use an input metadata source so it is not possible to create an output metadata document (.mdd) file. The name of the Access table to be read has been specified using the MR Init Project custom connection property.
InputDataSource(Input)
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrADODsc; _
Location=""Provider=MSDASQL.1; _
Persist Security Info=False; _
Extended Properties='DSN=MS Access Database; _
DBQ=C:\Inetpub\iissamples\sdk\asp\database\Authors.mdb; _
DriverId=281; _
FIL=MS Access; _
MaxBufferSize=2048; _
PageTimeout=5; _
UID=admin'""; _
MR Init Project=Authors"
End InputDatasource
OutputDataSource(Output)
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrSavDsc; _
Location=C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\MSAccessToSav.sav"
End OutputDataSource
The UNICOM Intelligence Developer Documentation Library includes two sample data management script files that transfer data from Access databases. The files are called are
MSAccessToQuantum.dms and
MSAccessToSav.dms (which is similar to the above example). For more information, see
Sample DMS files that integrate with Microsoft Office.
2. Microsoft Excel
The following example shows how to set up the InputDataSource section to read data on the first worksheet in an Excel file. The example uses the UNICOM Intelligence ADO DSC and the Microsoft OLE DB Provider for ODBC Drivers to read the Excel file. A .adoinfo file is used as an input metadata source as this allows the output metadata document (.mdd) to be created. When using a .adoinfo file, the names of the Excel file and the worksheet are specified in that file. You must set the MR Init MDSC custom connection property to specify that the UNICOM Intelligence ADO metadata source component (MDSC) should be used to read the metadata.
#define ADOInfoFile "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\MSExcelToDDF.adoinfo"
' The output Data Collection Data and MDD files...
#define OutputDDFFile "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\MSExcelToDDF.ddf"
#define OutputMDDFile "C:\Program Files\IBM\SPSS\DataCollection\7\DDL\Output\MSExcelToDDF.mdd"
InputDataSource(Input)
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrADODsc; _
MR Init MDSC=mrADODsc; _
Initial Catalog=" + ADOInfoFile
SelectQuery = "SELECT * FROM vdata"
End InputDatasource
OutputDataSource(Output)
ConnectionString = "Provider=mrOleDB.Provider.2; _
Data Source=mrDataFileDsc; _
Location=" + OutputDDFFile
MetaDataOutputName = OutputMDDFile
End OutputDataSource
Here are the contents of the .adoinfo file:
<ADODSC>
<ConnectionString>
DSN=Excel Files;
DBQ=C:\samples\MyMuseumData.xls;
DriverId=790;
MaxBufferSize=2048;
PageTimeout=5
</ConnectionString>
<Table>
Sheet1
</Table>
</ADODSC>
Depending on the version of Excel installed on your computer, you may have to specify the worksheet (“table”) name as Sheet1$.
This example is provided as a sample data management script file called
MSExcelToDDF.dms, which is installed with the UNICOM Intelligence Developer Documentation Library. A similar sample script that transfers Excel data to Quantum is
MSExcelToQuantum.dms. For more information, see
Sample DMS files that integrate with Microsoft Office.
See also