Creating CSV Files Using BCP and Stored Procedures

This article focuses on the use of the Bulk Copy Program (BCP) to create CSV files. Although it is possible to create a CSV file using DTS or SSIS, using BCP is often simpler and more efficient.

I use master..sysobjects as an example table to extract.

Create a simple CSV file

The simplest way to copy data from a database table to a file is to use the basic BCP command. Its format for creating a CSV file is as follows:

BCP <table> out <filename> <switches>

The switches used here are:

  • -c Output in ASCII with the default field terminator (tab) and row terminator (crlf)
  • -t override the field terminator with ","
  • -T use a trusted connection. Note that -U -P may be used for username/password
  • -S connect to this server to execute the command

Note that, like DTS/SSIS, BCP is a client utility, hence you need to supply the connection information.

For transfer of data between SQL servers, in place of -c, use -n or -N for native data format (-N = Unicode). This is much faster and avoids data conversion problems. Refer to the BCP utility entry in Books Online (BOL) for the complete format of the BCP command.

As BCP is a command line utility it is executed from T-SQL using xp_cmdshell. Create a directory called BCP on your c: drive and execute:
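A minimal sketch of the call (the server name and output path are illustrative; note that xp_cmdshell may need to be enabled via sp_configure on SQL Server 2005 and later):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.csv -c -t, -T -S MyServer'
```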

Other field and row delimiters

Often, character data includes commas which will be interpreted in the file as a field terminator. A simple way to cater for this is to use a different field delimiter which does not appear in the data. Commonly used characters are pipe (|):
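For example (server name illustrative):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.txt -c -t"|" -T -S MyServer'
```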

And caret (^):
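Along the same lines (server name illustrative):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.txt -c -t"^" -T -S MyServer'
```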

The terminators are not limited to a single character, so if necessary you can use (|^):
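For instance (server name illustrative):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.txt -c -t"|^" -T -S MyServer'
```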

Note that this will increase the size of the file and so slow down the import/export. Another way to cater for embedded commas is to “quote encapsulate” character data – more about that later.

It is unusual to need to change the row terminator from the default of crlf (carriage return, line feed) but occasionally you will need the cr or lf on its own.

To do this use the hex values: cr = 13 = 0x0D, lf = 10 = 0x0A. If you can’t remember these values they are easily obtained, as follows:
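For example, a quick check in T-SQL:

```sql
select ascii(char(13)) as cr, ascii(char(10)) as lf
-- cr = 13 (0x0D), lf = 10 (0x0A)
```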

To use these in the BCP command for the row terminator the -r switch is used:
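For example, to terminate rows with a bare line feed (server name illustrative; recent versions of bcp accept hexadecimal notation for terminators, so check BOL for your version):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.txt -c -t, -r 0x0A -T -S MyServer'
```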

When the resulting file is opened in Notepad the row terminators will not end the line – the character should appear as a non-ASCII character (a little oblong). If opened or copied to Query Analyzer (or Management Studio) these will be interpreted as line breaks and the file will be more readable.

The terminator characters can be encapsulated in double quotes in the command, to allow for spaces. I like to do this for anything other than a single-character delimiter. The following example gives a crlf row terminator and |^ field terminator.
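A sketch (server name illustrative; bcp recognises the usual \r and \n escapes in terminators):

```sql
exec master..xp_cmdshell 'bcp master..sysobjects out c:\bcp\sysobjects.txt -c -t "|^" -r "\r\n" -T -S MyServer'
```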

Formatting the extracted data

If you do not require all the columns/rows, or you wish to alter the data, a simple method is to create a view. This can be used in the BCP command in the same way as a table. The following view allows you to extract just the name, the formatted create date and time, and then order the results according to the date created (most recent first):
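A sketch of such a view (names and date formats are illustrative; note the TOP 100 PERCENT required to allow an ORDER BY in a view):

```sql
create view vSysobjectsCsv
as
select top 100 percent
       name,
       convert(char(10), crdate, 103) as createdate,
       convert(char(8), crdate, 8) as createtime
from master..sysobjects
order by crdate desc
```

The view name can then be used in place of the table name in the BCP command.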

Now we can quote encapsulate the extracted data by including the formatting in the view:
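For example (again, names illustrative):

```sql
create view vSysobjectsCsvQuoted
as
select top 100 percent
       '"' + name + '"' as name,
       '"' + convert(char(10), crdate, 103) + '"' as createdate
from master..sysobjects
order by crdate desc
```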

Note:
Quote encapsulation can also be performed by use of a format file.

Complete control – stored procedures

If you examined the BCP syntax in BOL, you might have noticed that it is possible to extract from a query by using the queryout keyword in place of out.

So, for example, an equivalent but neater version of the previous code extract would place the ORDER BY clause in the BCP statement rather than the view:
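Along the lines of (server name illustrative):

```sql
exec master..xp_cmdshell 'bcp "select name, convert(char(10), crdate, 103) from master..sysobjects order by crdate desc" queryout c:\bcp\sysobjects.csv -c -t, -T -S MyServer'
```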

Of course, strictly speaking, the view was not necessary at all as the query could have been included in the BCP statement but that can get difficult to maintain.

The queryout method allows you to BCP from the result of a stored procedure, which opens up a lot of possibilities and offers a lot of control over the file format. For anything other than a simple table extract I would tend to use this method rather than a view. I would also format each line within the stored procedure. This means that the formatting can be tested independently from the file creation.

Employing this technique, the above extract becomes:
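A sketch of this approach; the procedure formats each line itself into a single column, so no field terminator is needed (the database qualification and server name are illustrative):

```sql
create proc s_bcpMasterSysobjects
as
select '"' + name + '",' + convert(char(10), crdate, 103)
from master..sysobjects
order by crdate desc
go

exec master..xp_cmdshell 'bcp "exec mydb..s_bcpMasterSysobjects" queryout c:\bcp\sysobjects.csv -c -T -S MyServer'
```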

More complex formatting

We can now change the format of the extracted data to anything we require. For example, to include column names at the top and a rowcount at the end:
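A sketch (procedure name and formats are illustrative; @@rowcount picks up the count from the preceding insert):

```sql
create proc s_bcpMasterSysobjectsHdr
as
create table #a (seq int identity, line varchar(500))
insert #a (line) select 'name,createdate'
insert #a (line)
select '"' + name + '",' + convert(char(10), crdate, 103)
from master..sysobjects
insert #a (line) select 'rows: ' + convert(varchar(10), @@rowcount)
select line from #a order by seq
```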

This stored procedure will format the data as required, but when it is run via the BCP command it receives an error:

“Invalid object name ‘#a’”

Using the profiler you can see that when using queryout, the query is called twice. BCP tries to get the result set format by calling the stored procedure with fmtonly on. This does not execute the stored procedure but returns the format of the resultset. Unfortunately, it does not work if the stored procedure creates and uses a temp table. This problem can be circumvented by including a set fmtonly off command in the execution:
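For example, assuming a formatting procedure named s_bcpMasterSysobjectsHdr that uses a temp table (the names are illustrative):

```sql
exec master..xp_cmdshell 'bcp "set fmtonly off exec mydb..s_bcpMasterSysobjectsHdr" queryout c:\bcp\sysobjects.csv -c -T -S MyServer'
```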

Note that the column terminator is not used since the resultset is a single column.

Editor’s Note
This was the only piece of code that I failed to get working. The stored procedure executed fine outside the BCP, but when I tried to run the above command I received an error:

SQLState = HY010, NativeError = 0
Error = [Microsoft][SQL Native Client]Function sequence error
NULL

The author was unable to replicate this error. Is anyone aware of what might be causing this on my machine?
End Editor’s Note

Look at this execution using the profiler and you will see set fmtonly off followed by set fmtonly on. There is a potential problem with this though: it means that the stored procedure is executed twice, once for the format and once to extract the data – due to the fmtonly setting, both these calls will produce a resultset. This needs to be taken into account when considering how long the process will take and its impact on your system.

As the stored procedure is executed twice it should not contain any updates of permanent data. In particular, if data is flagged as exported then the flagging will be performed on the first call and no data will be extracted by the second call. All in all, this method should be used with caution, and comments (warnings) should be placed in the stored procedure for the aid of future maintenance.

To get round all of these problems, use the stored procedure to insert into a table then, from the BCP, extract from that table. This gives you added features that can be valuable on systems that are performing a lot of extracts:

  • It gives a record of the data extracted – if the file is lost it can be recreated.
  • The data extracted can be presented to a user for viewing from the database.
  • If there are problems with the file, the support can see the data without needing access to the file.

For this process, the format stored procedure, s_bcpMasterSysobjects, will insert into the table using an identifier and the BCP will extract using the same identifier. This process can be controlled by a control stored procedure, which allocates the identifier and passes it to the format stored procedure, which inserts the extracted data into a table with that identifier. It then calls another stored procedure (or in-line code) to create files via BCP for each identifier which is not marked as complete; the identifier is flagged as complete after the file creation. In this way the file creation becomes independent of the extract of the data.

This identifier can be passed to both processes from a controlling stored procedure. Alternatively, the format stored procedure can get the next available identifier itself, and the export can flag the identifier as extracted, so that it exports everything that has not yet been exported, allowing the export to be independent of the extract.

Which technique you use will depend on your system. The following code demonstrates the use of a controlling stored procedure:
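The following is a sketch of such an arrangement; all table, column, database and server names are illustrative assumptions:

```sql
create table ExtractData (
    ExtractId int,
    Seq1 varchar(30), Seq2 varchar(30), Seq3 varchar(30),
    line varchar(500),
    Extracted datetime null
)
go

-- format stored procedure: inserts the formatted rows under an identifier
create proc s_bcpMasterSysobjects @ExtractId int
as
insert ExtractData (ExtractId, Seq1, Seq2, Seq3, line)
select @ExtractId,
       convert(varchar(8), crdate, 112),   -- yyyymmdd, so an ascending sort works
       convert(varchar(8), crdate, 8),
       name,
       '"' + name + '",' + convert(char(10), crdate, 103)
from master..sysobjects
go

-- controlling stored procedure: allocates the identifier, formats, exports, flags
create proc s_ExtractControl
as
declare @ExtractId int, @cmd varchar(1000)
select @ExtractId = isnull(max(ExtractId), 0) + 1 from ExtractData
exec s_bcpMasterSysobjects @ExtractId
select @cmd = 'bcp "select line from mydb..ExtractData where ExtractId = '
            + convert(varchar(10), @ExtractId)
            + ' order by Seq1, Seq2, Seq3" queryout c:\bcp\extract'
            + convert(varchar(10), @ExtractId) + '.csv -c -T -S MyServer'
exec master..xp_cmdshell @cmd
update ExtractData set Extracted = getdate() where ExtractId = @ExtractId
go
```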

Now the data is extracted via:
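Assuming a controlling procedure named s_ExtractControl (an illustrative name):

```sql
exec s_ExtractControl
```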

You can view the data extracted via:
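For instance (table and column names are illustrative):

```sql
select line
from ExtractData
where ExtractId = 1
order by Seq1, Seq2, Seq3
```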

Alternatively, you can use the following, which will also give the date that the data was extracted:
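Along the lines of (names illustrative; the extract date is assumed to be held against each row):

```sql
select line, Extracted
from ExtractData
where ExtractId = 1
order by Seq1, Seq2, Seq3
```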

Note:
For the extract I have given three sort columns, Seq1, Seq2 and Seq3, which are all ordered ascending. This means that the extracting stored procedure needs to place the values in ascending order (see the date manipulation). This could be extended with ascending and descending sort columns, or the usage could depend on the extract type.

For a system which is performing a lot of small exports I will usually implement this process, and a similar process for imports.

Extracting all tables from a database

This is usually performed to transfer data between servers/databases and so should use the native data format. The easiest way is to create the BCP statements from a query then execute them:
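A sketch of the generating query, run in the source database (server name illustrative):

```sql
select 'exec master..xp_cmdshell ''bcp "' + db_name() + '.dbo.' + name
     + '" out c:\bcp\' + name + '.dat -n -T -S MyServer'''
from sysobjects
where type = 'U'
```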

The result of this will be a series of BCP statements of the form:
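For example (database, table and server names illustrative):

```sql
exec master..xp_cmdshell 'bcp "mydb.dbo.mytable" out c:\bcp\mytable.dat -n -T -S MyServer'
```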

This will extract all tables in the database. To import, just change the "out" to "in".

Summary

We have seen how to use BCP to perform a simple extract of a table to a CSV file, how to perform slightly more complex extracts using a view, how to extract using a stored procedure to format the data, and finally a system which allows the data to be extracted, saved in a table, and viewed/exported from there.


C# – SSIS Dynamic package loading

C# – SSIS Dynamic package loading. Load a SSIS package using .net c#

 

using System;
using Microsoft.SqlServer.Dts.Runtime;

namespace ST_43cbf51a82014806a9fa9aeec94bdb08
{

[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

class MyEventListener : DefaultEvents
{
public override bool OnError(DtsObject source, int errorCode, string subComponent,
string description, string helpFile, int helpContext, string idofInterfaceWithError)
{
// Add application-specific diagnostics here.
Console.WriteLine("Error test in {0}/{1} : {2}", source, subComponent, description);
return false;
}
}
public void Main()
{
string pkgLocation;
Package pkg;
Microsoft.SqlServer.Dts.Runtime.Application app;
DTSExecResult pkgResults;

MyEventListener eventListener = new MyEventListener();

pkgLocation = Dts.Variables["User::package_directory"].Value.ToString() + Dts.Variables["User::package_name"].Value.ToString();

app = new Microsoft.SqlServer.Dts.Runtime.Application();
pkg = app.LoadPackage(pkgLocation, eventListener);
pkgResults = pkg.Execute(null, null, eventListener, null, null);

foreach (ConnectionManager conn in pkg.Connections)
{
Dts.Variables["User::package_connection_string"].Value = conn.ConnectionString.ToString();
}
}
}
}

SSIS: Package using SQL Authentication and DontSaveSensitive as ProtectionLevel

In this post I look at the steps to follow when running an SSIS package that uses SQL authentication and DontSaveSensitive as its ProtectionLevel.

I created a simple SSIS package with a connection manager going to SQL Server using SQL authentication; my ProtectionLevel is EncryptSensitiveWithUserKey.


My package has an OLE DB source (using the connection manager above) which queries a table in the database and dumps a few rows into a flat file.


With the above settings, when I execute my package, it works like a charm.


Now, I changed the ProtectionLevel  to DontSaveSensitive


And the package failed


With the following error:

SSIS package “Package.dtsx” starting.

Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning.

Error: 0xC0202009 at Package, Connection manager “runeet2k8.sa”: An OLE DB error has occurred. Error code: 0x80040E4D.

An OLE DB record is available.  Source: “Microsoft SQL Native Client”  Hresult: 0x80040E4D  Description: “Login failed for user ‘sa’.”.

Error: 0xC020801C at Data Flow Task, OLE DB Source [1]: The AcquireConnection method call to the connection manager “runeet2k8.sa” failed with error code 0xC0202009.

Error: 0xC0047017 at Data Flow Task, DTS.Pipeline: component “OLE DB Source” (1) failed validation and returned error code 0xC020801C.

Error: 0xC004700C at Data Flow Task, DTS.Pipeline: One or more component failed validation.

Error: 0xC0024107 at Data Flow Task: There were errors during task validation.

SSIS package “Package.dtsx” finished: Failure.

What we see above is expected, as we have sensitive information in the package (the password for the SQL account) and have chosen DontSaveSensitive as the protection level. The effect is the password being cleared from the connection manager, hence the login failure.

For this situation, we have to save the password somewhere outside the package. I chose to save it in an XML configuration file, as shown below.


I had to choose the correct connection manager; it's called runeet2k8.sa in my case.


This is the XML configuration file that BIDS wrote for me

<?xml version="1.0"?>
<DTSConfiguration>
      <DTSConfigurationHeading>
            <DTSConfigurationFileInfo GeneratedBy="Domain\UserName" GeneratedFromPackageName="Package" GeneratedFromPackageID="{77FB98FB-E1AF-48D9-8A43-9FD6B1790837}" GeneratedDate="22-12-2009 16:12:59"/>
      </DTSConfigurationHeading>
      <Configuration ConfiguredType="Property" Path="\Package.Connections[runeet2k8.sa].Properties[Password]" ValueType="String">
            <ConfiguredValue></ConfiguredValue>
      </Configuration>
</DTSConfiguration>

 

I have to go in and type in the password value.

 

<?xml version="1.0"?>
<DTSConfiguration>
      <DTSConfigurationHeading>
            <DTSConfigurationFileInfo GeneratedBy="FAREAST\runeetv" GeneratedFromPackageName="Package" GeneratedFromPackageID="{77FB98FB-E1AF-48D9-8A43-9FD6B1790837}" GeneratedDate="22-12-2009 16:12:59"/>
      </DTSConfigurationHeading>
      <Configuration ConfiguredType="Property" Path="\Package.Connections[runeet2k8.sa].Properties[Password]" ValueType="String">
            <ConfiguredValue>Password</ConfiguredValue>
      </Configuration>
</DTSConfiguration>

Now I have a package with ProtectionLevel DontSaveSensitive and the password value for the connection manager saved in a configuration file outside the package. The package executed absolutely fine after this.


I went ahead and set this package up as an Integration Services job under SQL Agent.


Executed the job and it ran successfully.


One important thing to note: if you are moving your package from your development machine to, say, a test server for execution, you will have to move your configuration file as well and put it at the same path on the target machine as on your development box.

For example, on my box, the config file PackageConfig.dtsconfig  is placed at D:\Cases\Integration Services Project1\PackageConfig.dtsconfig and if I am moving my package from my development machine to somewhere else, I would have to move my config file as well and place it at location D:\Cases\Integration Services Project1\ on that machine.

If the dtsconfig file is not at the expected location, the job will fail.

If you don’t want to create the same complex folder path for the dtsconfig on the target machine as on your dev box (e.g. D:\Cases\Integration Services Project1\), you can place the dtsconfig file anywhere you want and supply its path from the Configuration tab under your SSIS job step.


SSIS Error: Violation of primary key constraint when no duplicate exists 0x80040E2F

Scenario:

  1. I have a source table, dbo.Codes, with the primary key combination (CodeID, CodeName).
  2. In order to create my destination table, I used the SQL Server–>Database–>Tasks–>Generate Scripts option.
  3. Then, in the SSIS package, I simply used the OLE DB Source and OLE DB Destination elements. It returned the error: "Violation of PRIMARY KEY constraint 'pkCodes'. Cannot insert duplicate key in object 'dbo.Codes'. The duplicate key value is (106, da7A)."

What I have tried to solve:

  1. I tried to use a SQL command as the source: Select CodeID, CodeName from dbo.Codes GROUP BY CodeID, CodeName. It still returned the error, which confused me at this point.
  2. After searching online, I found a tip to add a SSIS Toolbox–>Common–>Sort element between the OLE DB Source and OLE DB Destination. I checked the option "Remove rows with duplicate sort values" in the Sort element. I was still getting the error.
  3. Then I enabled the data viewer between the OLE DB Source and the Sort element. I could see there are two rows, (106, da7a) and (106, da7A), in the source.

What is the real problem?

My source table column CodeName is case sensitive, but the CodeName column in my destination table is not. This occurred because the SQL Server Generate Scripts option Script Collation is set to False by default.

Solution:

I recreated my destination table with the Script Collation option set to True, which made my destination column case sensitive and solved my problem.
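For illustration, a case-sensitive destination column can also be declared explicitly; the collation name here is an example, so pick the one matching your source:

```sql
create table dbo.Codes (
    CodeID   int not null,
    CodeName varchar(10) collate SQL_Latin1_General_CP1_CS_AS not null,
    constraint pkCodes primary key (CodeID, CodeName)
)
```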

 

SSIS C#: Create Excel file, worksheet, header row dynamically

I need to

  1. create an Excel file,
  2. which will have multiple worksheets;
  3. every worksheet will have a header row.

The Excel file name, worksheet names and header row column names will be provided dynamically.

ScriptMain.cs file

 

namespace ST_50e9d5ec3f17426585feb2e6ab8d41bb
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
public static StringBuilder informationLog = new StringBuilder();
//Log log;

public void Main()
{
//Dts.TaskResult = (int)ScriptResults.Success;
try
{
SqlConnection sqlConnection = ConnectToDataSource("ADONET_IntegratedCare");

DataTable dtControl = GetControlInfo(sqlConnection);

ClearFolder(dtControl);
SQLToExcel(sqlConnection, dtControl);
//MoveFile(dtControl);

sqlConnection.Close();
dtControl.Clear();
}
catch (Exception x)
{
StringBuilder errorMessage = new StringBuilder();
Exception e = x;
while (e != null)
{
errorMessage.AppendFormat("Exception occurred in [{0}]: {1}\n", e.Source, e.Message);
errorMessage.AppendFormat("Stack trace: \n{0}\n", e.StackTrace);
errorMessage.AppendLine();
e = e.InnerException;
}
//log.Error("Adhoc - exception", errorMessage.ToString());
}
}
}
}

ScriptMainDataAccess.cs

using Microsoft.SqlServer.Dts.Runtime;
using System;
using System.Data;
using System.Data.SqlClient;

namespace ST_50e9d5ec3f17426585feb2e6ab8d41bb
{
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region private SqlConnection ConnectToDataSource(SqlConnection sqlConnection)
private SqlConnection ConnectToDataSource(String dataSource)
{
SqlConnection sqlConnection = new SqlConnection();
try
{
using (ConnectionManager adonetConnection = (ConnectionManager)Dts.Connections[dataSource])
{
sqlConnection = (SqlConnection)adonetConnection.AcquireConnection(Dts.Transaction);
if (sqlConnection.State == ConnectionState.Open)
{
sqlConnection.Close();
}
sqlConnection.Open();
//Log.Information(“Adhoc”, String.Format(“Connect to the data source: {0}.{1}”, sqlConnection.DataSource, sqlConnection.Database));
};
}
catch (Exception e)
{
throw e;
}
return sqlConnection;
}
#endregion

#region private DataTable GetControlInfo(SqlConnection sqlConnection)
private DataTable GetControlInfo(SqlConnection sqlConnection)
{
SqlDataAdapter da = new SqlDataAdapter();
DataSet dsControl = new DataSet();
DataTable dtControl = new DataTable();
SqlCommand sqlCommand;
try
{

String controlStatement = Dts.Variables["$Package::ControlStatement"].Value.ToString();
String dataType = Dts.Variables["$Package::DataType"].Value.ToString();

sqlCommand = new SqlCommand(controlStatement, sqlConnection);
da.SelectCommand = sqlCommand;
da.Fill(dsControl, "Adhoc_DataTable");
dtControl = dsControl.Tables["Adhoc_DataTable"];
//log.Information("Adhoc - Information", "Get control information from the package parameter: $Package::ControlStatement");
}
catch (Exception e)
{
throw e;
}
return dtControl;
}
#endregion
}
}

ScriptMainFile.cs

using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
using System.Text;

namespace ST_50e9d5ec3f17426585feb2e6ab8d41bb
{
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region private void ClearFolder(DataTable dtControl)
private void ClearFolder(DataTable dtControl)
{
//delete all the files from a directory
try
{
foreach (DataRow dr in dtControl.Rows)
{
String path = dr["DataProcessingDirectory"].ToString();
if (Directory.Exists(path))
{
DirectoryInfo di = new DirectoryInfo(path);
foreach (FileInfo file in di.GetFiles())
{
file.Delete();
//log.Information("Adhoc-information", string.Format("Deleted file {0} ", path+file));
}
}
else
{
Directory.CreateDirectory(path);
};
}
}
catch (Exception e)
{
throw e;
}
}
#endregion

#region private void MoveFile(DataTable dtControl)
private void MoveFile(DataTable dtControl)
{
try
{
foreach (DataRow dr in dtControl.Rows)
{
String sourcePath = dr["DataProcessingDirectory"].ToString();
String destinationPath = dr["FilePath"].ToString();
if (Directory.Exists(destinationPath) == false)
{
Directory.CreateDirectory(destinationPath);
}
DirectoryInfo di = new DirectoryInfo(sourcePath);
foreach (FileInfo fileName in di.GetFiles())
{
if (File.Exists(destinationPath + fileName))
{
File.Delete(destinationPath + fileName);
}
fileName.MoveTo(destinationPath + fileName);
//log.Information("Adhoc-information", string.Format("Move the file from {0} to {1}", sourcePath + fileName, destinationPath + fileName));
}
}
}
catch (Exception e)
{
throw e;
}

}
#endregion
}
}

ScriptMainExcel.cs

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.Linq;
using System.Windows.Forms;
using System.Collections.Generic;
using System.Runtime.InteropServices;

using Excel = Microsoft.Office.Interop.Excel;
using X14 = DocumentFormat.OpenXml.Office2010.Excel;
using X15 = DocumentFormat.OpenXml.Office2013.Excel;

namespace ST_50e9d5ec3f17426585feb2e6ab8d41bb
{
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
//MessageBox.Show(fileExtension);

 

#region private void SQLToExcel(SqlConnection sqlConnection, DataTable dtControl)
private void SQLToExcel(SqlConnection sqlConnection, DataTable dtControl)
{
try
{
foreach (DataRow dr in dtControl.Rows)
{
String filePath = dr["DataProcessingDirectory"].ToString();
String workbookNamePostfix = dr["WorkbookNamePostfix"].ToString();
String fileExtension = dr["FileExtension"].ToString().ToLower();
String worksheetName = dr["WorksheetName"].ToString();
String sqlStatement = dr["SqlStatement"].ToString();
String fileName = filePath
+ workbookNamePostfix
+ DateTime.Now.ToString("yyyyMMdd")
+ fileExtension
;
SqlDataAdapter da = new SqlDataAdapter();
DataSet dsResult = new DataSet();
DataTable dtResult = new DataTable();
SqlCommand sqlCommand = new SqlCommand(sqlStatement, sqlConnection);
da.SelectCommand = sqlCommand;
da.Fill(dsResult, "Adhoc_Result");
dtResult = dsResult.Tables["Adhoc_Result"];
if (fileExtension == ".xlsx")
{
if (File.Exists(fileName))//if file already exist
{
OpenExcelFile_OpenXml(dtResult, fileName, worksheetName);
}
else//if file not exist
{
CreateExcelFile_OpenXml(dtResult, fileName, worksheetName);
}
/*The auto-fit logic is something which is implemented by Microsoft Excel, and is not a part of the OpenXML spreadsheet format.
* Auto-fit involves measuring the width (or height) of the value in each cell and finding the maximum value.
* In order to implement auto-fit in your own code, you will have to manually measure the text.*/
ExcelStyle_InteropExcel(fileName, worksheetName);
}

}
}
catch (Exception e)
{
throw e;
}
}
#endregion

#region private void CreateExcelFile_OpenXml(DataTable dtResult, string fileName, string worksheetName)
private void CreateExcelFile_OpenXml(DataTable dtResult, string fileName, string worksheetName)
{
try
{
SpreadsheetDocument spreadsheetDocument;
WorkbookPart workbookPart;
WorksheetPart worksheetPart;
WorkbookStylesPart wbsp;
Sheets sheets;
Sheet sheet;
int rowNumber = dtResult.Rows.Count;
int columNumber = dtResult.Columns.Count;

 

// Create a spreadsheet document by supplying the filepath.
// By default, AutoSave = true, Editable = true, and Type = xlsx.
spreadsheetDocument = SpreadsheetDocument.Create(fileName, SpreadsheetDocumentType.Workbook);

// Add a WorkbookPart to the document.
workbookPart = spreadsheetDocument.AddWorkbookPart();
workbookPart.Workbook = new Workbook();

// add styles to sheet
wbsp = workbookPart.AddNewPart<WorkbookStylesPart>();
wbsp.Stylesheet = GenerateWorkbookStylesPart1Content();

// generate rows
SheetData sheetData = CreateSheetData_OpenXml(dtResult);

// Add a WorksheetPart to the WorkbookPart.
worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
Worksheet worksheet = new Worksheet();
worksheet.Append(sheetData);
worksheetPart.Worksheet = worksheet;

wbsp.Stylesheet.Save();

/*set a auto filter*/
AutoFilter autoFilter = new AutoFilter() { Reference = ("A1:" + ExcelColumnNumberToName(columNumber.ToString()) + "1") };
worksheet.Append(autoFilter);

// Add Sheets to the Workbook.
sheets = workbookPart.Workbook.AppendChild(new Sheets());

// Append a new worksheet and associate it with the workbook.
sheet = new Sheet() { Id = workbookPart.GetIdOfPart(worksheetPart), SheetId = 1, Name = worksheetName };

sheets.Append(sheet);
workbookPart.Workbook.Save();

//string cellReference = GetCellReference(@"c:\temp\test.xlsx", "Sheet1", 1, "Quantity");
//Console.WriteLine(GetColumnName(cellReference)); //prints D for your example

 

// Close the document.
//spreadsheetDocument.WorkbookPart.Workbook.Save();
spreadsheetDocument.Close();

}// try

catch (Exception e)
{
throw e;
}
}
#endregion

#region private void OpenExcelFile_OpenXml(DataTable dtResult, string fileName, string worksheetName)
private void OpenExcelFile_OpenXml(DataTable dtResult, string fileName, string worksheetName)
{
try
{
SpreadsheetDocument spreadsheetDocument;
WorksheetPart worksheetPart;
Sheets sheets;
Sheet sheet;
int rowNumber = dtResult.Rows.Count;
int columNumber = dtResult.Columns.Count;

// open a spreadsheet document to modify
spreadsheetDocument = SpreadsheetDocument.Open(fileName, true);

// generate rows
SheetData sheetData = CreateSheetData_OpenXml(dtResult);

// Add a WorksheetPart to the WorkbookPart.
worksheetPart = spreadsheetDocument.WorkbookPart.AddNewPart<WorksheetPart>();
Worksheet worksheet = new Worksheet();
worksheet.Append(sheetData);
worksheetPart.Worksheet = worksheet;

/*set a auto filter*/
AutoFilter autoFilter = new AutoFilter() { Reference = ("A1:" + ExcelColumnNumberToName(columNumber.ToString()) + "1") };
worksheet.Append(autoFilter);

// Add Sheets to the Workbook.
sheets = spreadsheetDocument.WorkbookPart.Workbook.GetFirstChild<Sheets>();
string relationshipId = spreadsheetDocument.WorkbookPart.GetIdOfPart(worksheetPart);

// Get a unique ID for the new worksheet.
uint sheetId = 1;
if (sheets.Elements<Sheet>().Count() > 0)
{
sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
}

// Append the new worksheet and associate it with the workbook.
sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = worksheetName };

sheets.Append(sheet);
//worksheetPart.Worksheet.Save();

// Close the document.
//spreadsheetDocument.WorkbookPart.Workbook.Save();
spreadsheetDocument.Close();

}// try

catch (Exception e)
{
throw e;
}
}
#endregion

#region private SheetData CreateSheetData_OpenXml(DataTable dtResult)
private SheetData CreateSheetData_OpenXml(DataTable dtResult)
{
SheetData sheetData = new SheetData();
//worksheetPart.Worksheet = new Worksheet(sheetData);

Row headerRow = new Row();

List<String> columns = new List<string>();
foreach (System.Data.DataColumn column in dtResult.Columns)
{
columns.Add(column.ColumnName);
Cell cell = new Cell() { StyleIndex = (UInt32Value)2U };
cell.DataType = CellValues.String;
cell.CellValue = new CellValue(column.ColumnName);
//MessageBox.Show("");
headerRow.AppendChild(cell);
}

sheetData.AppendChild(headerRow);

foreach (DataRow dsrow in dtResult.Rows)
{
Row newRow = new Row();
foreach (String col in columns)
{
//Cell cell = new Cell();
Cell cell = new Cell() { StyleIndex = (UInt32Value)1U };
cell.DataType = CellValues.String;
cell.CellValue = new CellValue(dsrow[col].ToString());
newRow.AppendChild(cell);
}
sheetData.AppendChild(newRow);
}
return sheetData;
}
#endregion

#region private Stylesheet GenerateWorkbookStylesPart1Content()
// Generates content of workbookStylesPart1.
private Stylesheet GenerateWorkbookStylesPart1Content()
{
Stylesheet stylesheet1 = new Stylesheet() { MCAttributes = new MarkupCompatibilityAttributes() { Ignorable = "x14ac" } };
stylesheet1.AddNamespaceDeclaration("mc", "http://schemas.openxmlformats.org/markup-compatibility/2006");
stylesheet1.AddNamespaceDeclaration("x14ac", "http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac");

Fonts fonts1 = new Fonts() { Count = (UInt32Value)2U, KnownFonts = true };

Font font1 = new Font();
FontSize fontSize1 = new FontSize() { Val = 11D };
Color color1 = new Color() { Theme = (UInt32Value)1U };
FontName fontName1 = new FontName() { Val = "Calibri" };
FontFamilyNumbering fontFamilyNumbering1 = new FontFamilyNumbering() { Val = 2 };
FontScheme fontScheme1 = new FontScheme() { Val = FontSchemeValues.Minor };

font1.Append(fontSize1);
font1.Append(color1);
font1.Append(fontName1);
font1.Append(fontFamilyNumbering1);
font1.Append(fontScheme1);

Font font2 = new Font();
Bold bold1 = new Bold();
FontSize fontSize2 = new FontSize() { Val = 11D };
Color color2 = new Color() { Theme = (UInt32Value)1U };
FontName fontName2 = new FontName() { Val = "Calibri" };
FontFamilyNumbering fontFamilyNumbering2 = new FontFamilyNumbering() { Val = 2 };
FontScheme fontScheme2 = new FontScheme() { Val = FontSchemeValues.Minor };

font2.Append(bold1);
font2.Append(fontSize2);
font2.Append(color2);
font2.Append(fontName2);
font2.Append(fontFamilyNumbering2);
font2.Append(fontScheme2);

fonts1.Append(font1);
fonts1.Append(font2);

Fills fills1 = new Fills() { Count = (UInt32Value)4U };

Fill fill1 = new Fill();
PatternFill patternFill1 = new PatternFill() { PatternType = PatternValues.None };

fill1.Append(patternFill1);

Fill fill2 = new Fill();
PatternFill patternFill2 = new PatternFill() { PatternType = PatternValues.Gray125 };

fill2.Append(patternFill2);

Fill fill3 = new Fill();

PatternFill patternFill3 = new PatternFill() { PatternType = PatternValues.Solid };
ForegroundColor foregroundColor1 = new ForegroundColor() { Theme = (UInt32Value)0U, Tint = -0.499984740745262D };
BackgroundColor backgroundColor1 = new BackgroundColor() { Indexed = (UInt32Value)64U };

patternFill3.Append(foregroundColor1);
patternFill3.Append(backgroundColor1);

fill3.Append(patternFill3);

Fill fill4 = new Fill();

PatternFill patternFill4 = new PatternFill() { PatternType = PatternValues.Solid };
ForegroundColor foregroundColor2 = new ForegroundColor() { Rgb = “FFFF0000” };
BackgroundColor backgroundColor2 = new BackgroundColor() { Indexed = (UInt32Value)64U };

patternFill4.Append(foregroundColor2);
patternFill4.Append(backgroundColor2);

fill4.Append(patternFill4);

fills1.Append(fill1);
fills1.Append(fill2);
fills1.Append(fill3);
fills1.Append(fill4);

Borders borders1 = new Borders() { Count = (UInt32Value)2U };

Border border1 = new Border();
LeftBorder leftBorder1 = new LeftBorder();
RightBorder rightBorder1 = new RightBorder();
TopBorder topBorder1 = new TopBorder();
BottomBorder bottomBorder1 = new BottomBorder();
DiagonalBorder diagonalBorder1 = new DiagonalBorder();

border1.Append(leftBorder1);
border1.Append(rightBorder1);
border1.Append(topBorder1);
border1.Append(bottomBorder1);
border1.Append(diagonalBorder1);

Border border2 = new Border();

LeftBorder leftBorder2 = new LeftBorder() { Style = BorderStyleValues.Thin };
Color color3 = new Color() { Indexed = (UInt32Value)64U };

leftBorder2.Append(color3);

RightBorder rightBorder2 = new RightBorder() { Style = BorderStyleValues.Thin };
Color color4 = new Color() { Indexed = (UInt32Value)64U };

rightBorder2.Append(color4);

TopBorder topBorder2 = new TopBorder() { Style = BorderStyleValues.Thin };
Color color5 = new Color() { Indexed = (UInt32Value)64U };

topBorder2.Append(color5);

BottomBorder bottomBorder2 = new BottomBorder() { Style = BorderStyleValues.Thin };
Color color6 = new Color() { Indexed = (UInt32Value)64U };

bottomBorder2.Append(color6);
DiagonalBorder diagonalBorder2 = new DiagonalBorder();

border2.Append(leftBorder2);
border2.Append(rightBorder2);
border2.Append(topBorder2);
border2.Append(bottomBorder2);
border2.Append(diagonalBorder2);

borders1.Append(border1);
borders1.Append(border2);

CellStyleFormats cellStyleFormats1 = new CellStyleFormats() { Count = (UInt32Value)1U };
CellFormat cellFormat1 = new CellFormat() { NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)0U, FillId = (UInt32Value)0U, BorderId = (UInt32Value)0U };

cellStyleFormats1.Append(cellFormat1);

CellFormats cellFormats1 = new CellFormats() { Count = (UInt32Value)4U };
CellFormat cellFormat2 = new CellFormat() { NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)0U, FillId = (UInt32Value)0U, BorderId = (UInt32Value)0U, FormatId = (UInt32Value)0U };
CellFormat cellFormat3 = new CellFormat() { NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)0U, FillId = (UInt32Value)0U, BorderId = (UInt32Value)1U, FormatId = (UInt32Value)0U, ApplyBorder = true };
CellFormat cellFormat4 = new CellFormat() { NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)1U, FillId = (UInt32Value)2U, BorderId = (UInt32Value)1U, FormatId = (UInt32Value)0U, ApplyFont = true, ApplyFill = true, ApplyBorder = true };
CellFormat cellFormat5 = new CellFormat() { NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)0U, FillId = (UInt32Value)3U, BorderId = (UInt32Value)1U, FormatId = (UInt32Value)0U, ApplyFill = true, ApplyBorder = true };

cellFormats1.Append(cellFormat2);
cellFormats1.Append(cellFormat3);
cellFormats1.Append(cellFormat4);
cellFormats1.Append(cellFormat5);

CellStyles cellStyles1 = new CellStyles() { Count = (UInt32Value)1U };
CellStyle cellStyle1 = new CellStyle() { Name = "Normal", FormatId = (UInt32Value)0U, BuiltinId = (UInt32Value)0U };

cellStyles1.Append(cellStyle1);
DifferentialFormats differentialFormats1 = new DifferentialFormats() { Count = (UInt32Value)0U };
TableStyles tableStyles1 = new TableStyles() { Count = (UInt32Value)0U, DefaultTableStyle = "TableStyleMedium2", DefaultPivotStyle = "PivotStyleMedium9" };

StylesheetExtensionList stylesheetExtensionList1 = new StylesheetExtensionList();

StylesheetExtension stylesheetExtension1 = new StylesheetExtension() { Uri = "{EB79DEF2-80B8-43e5-95BD-54CBDDF9020C}" };
stylesheetExtension1.AddNamespaceDeclaration("x14", "http://schemas.microsoft.com/office/spreadsheetml/2009/9/main");
X14.SlicerStyles slicerStyles1 = new X14.SlicerStyles() { DefaultSlicerStyle = "SlicerStyleLight1" };

stylesheetExtension1.Append(slicerStyles1);

StylesheetExtension stylesheetExtension2 = new StylesheetExtension() { Uri = "{9260A510-F301-46a8-8635-F512D64BE5F5}" };
stylesheetExtension2.AddNamespaceDeclaration("x15", "http://schemas.microsoft.com/office/spreadsheetml/2010/11/main");
X15.TimelineStyles timelineStyles1 = new X15.TimelineStyles() { DefaultTimelineStyle = "TimeSlicerStyleLight1" };

stylesheetExtension2.Append(timelineStyles1);

stylesheetExtensionList1.Append(stylesheetExtension1);
stylesheetExtensionList1.Append(stylesheetExtension2);

stylesheet1.Append(fonts1);
stylesheet1.Append(fills1);
stylesheet1.Append(borders1);
stylesheet1.Append(cellStyleFormats1);
stylesheet1.Append(cellFormats1);
stylesheet1.Append(cellStyles1);
stylesheet1.Append(differentialFormats1);
stylesheet1.Append(tableStyles1);
stylesheet1.Append(stylesheetExtensionList1);

return stylesheet1;
}

#endregion

#region private void ExcelStyle_InteropExcel(String fileName, String worksheetName)
private void ExcelStyle_InteropExcel(String fileName, String worksheetName)
{
Excel.Application xlApp = null;
Excel.Workbook xlWorkBook = null;
Excel.Sheets xlSheets = null;
Excel.Worksheet xlWorksheet = null;

try
{
FileInfo file = new FileInfo(fileName);
// initialize interop excel objects
xlApp = new Excel.Application();
xlApp.DisplayAlerts = false;
xlApp.Visible = true;
xlWorkBook = xlApp.Workbooks.Open(fileName, ReadOnly: false, Editable: true);
xlSheets = xlWorkBook.Sheets;
xlWorksheet = xlWorkBook.Sheets[worksheetName];

//#################################
//create table
//xlWorksheet.Columns.ClearFormats();
//xlWorksheet.Rows.ClearFormats();
/*get the number of columns and rows in a excel file*/
//int colNumber = xlWorksheet.UsedRange.Columns.Count;
//int rowNumber = xlWorksheet.UsedRange.Rows.Count;
////Add some formatting

//Excel.Range rng1 = xlWorksheet.get_Range((Excel.Range)xlWorksheet.Cells[2, 1], (Excel.Range)xlWorksheet.Cells[1, colNumber]);
//rng1.Font.Bold = true;
//rng1.Font.ColorIndex = 3;
//rng1.Borders.ColorIndex = 1;
xlWorksheet.Columns.AutoFit();

//#################################
xlWorkBook.SaveAs(fileName, Microsoft.Office.Interop.Excel.XlFileFormat.xlOpenXMLWorkbook, System.Reflection.Missing.Value, System.Reflection.Missing.Value, false, false, Excel.XlSaveAsAccessMode.xlNoChange, false, false, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value);
xlWorkBook.Close();
xlApp.Quit();
}
catch (Exception)
{
throw;   // rethrow without resetting the stack trace
}
finally
{
/* Release the COM resources. If they are not released properly, the file
* stays locked by the Excel process, which stops the user from modifying or even opening it. */
if (xlWorksheet != null) Marshal.ReleaseComObject(xlWorksheet);
if (xlSheets != null) Marshal.ReleaseComObject(xlSheets);
if (xlWorkBook != null) Marshal.ReleaseComObject(xlWorkBook);
if (xlApp != null) Marshal.ReleaseComObject(xlApp);

xlWorksheet = null;
xlSheets = null;
xlWorkBook = null;
xlApp = null;

GC.Collect();                    // collect first so the RCW finalizers are queued
GC.WaitForPendingFinalizers();
}
}
#endregion

#region public static int ExcelColumnNameToNumber(string input)
public static int ExcelColumnNameToNumber(string input)
{
input = input.ToUpperInvariant();
int sum = 0;
int charExcelNumber;
for (int i = 0; i < input.Length; i++)
{
/*
In Excel A=1 and Z=26. In ASCII the value of 'A' is 65 and of 'Z' is 90, and a char
and its ASCII value can be used interchangeably.
If the input is ABC then input[0]='A', input[1]='B', input[2]='C'.
Each character's Excel column number: A=1, B=2, C=3.
The formula to convert: ABC = 1*26^2 + 2*26^1 + 3*26^0.
*/
charExcelNumber = ((int)input[i] - 'A' + 1);
sum = sum + charExcelNumber * (int)Math.Pow(26, (input.Length - 1 - i));
}
return sum;
}
#endregion

#region private static String ExcelColumnNumberToName(String columnNumber)
private static String ExcelColumnNumberToName(String columnNumber)
{
int colNum;
Int32.TryParse(columnNumber, out colNum);
int dividend = colNum;
string columnName = String.Empty;
int modulo;

while (dividend > 0)
{
/*
Excel column names are base-26 numerals whose "digits" run 1..26 (A..Z) rather
than 0..25, so subtract 1 before taking the modulo. E.g. 26 -> "Z" (not "@"),
28 -> "AB".
*/
modulo = (dividend - 1) % 26;
columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
dividend = (dividend - modulo - 1) / 26;
}
return columnName;
}
#endregion
}//class
}//namespace
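The two column helpers above are inverses of each other. A minimal standalone sketch (a hypothetical console wrapper, not part of the original class) shows the round trip using the same base-26 logic:

```csharp
using System;

// Hypothetical standalone demo of the column-name/number round trip.
class ColumnDemo
{
    public static int NameToNumber(string name)
    {
        int sum = 0;
        foreach (char c in name.ToUpperInvariant())
            sum = sum * 26 + (c - 'A' + 1);   // base-26 with digits 1..26
        return sum;
    }

    public static string NumberToName(int number)
    {
        string name = String.Empty;
        while (number > 0)
        {
            int modulo = (number - 1) % 26;      // shift to digits 0..25 first
            name = (char)('A' + modulo) + name;
            number = (number - modulo - 1) / 26;
        }
        return name;
    }

    static void Main()
    {
        Console.WriteLine(NameToNumber("ABC"));                // 731 = 1*26^2 + 2*26 + 3
        Console.WriteLine(NumberToName(28));                   // AB
        Console.WriteLine(NumberToName(NameToNumber("XFD")));  // XFD (Excel's last column, 16384)
    }
}
```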

 

SSIS Debug: The SSIS package dynamically created by script

I had a Script Task that creates a package dynamically and loads data. While debugging, I wanted to see the dynamically created package, open it, and check that everything was working as it should.

In order to see the package, follow the steps:

  1. In the package, open the Variables window.
  2. In the Variables window, click Grid Options.
    1. Check Show system variables.
  3. Find the DebugMode variable and set it to True.
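If the package is built in code, another option is to dump the in-memory package to a .dtsx file and open that in the designer. A minimal sketch, assuming a reference to Microsoft.SqlServer.ManagedDTS; the output path is hypothetical:

```csharp
// Sketch: persist a dynamically built SSIS package so it can be opened
// in the designer. Assumes a reference to Microsoft.SqlServer.ManagedDTS;
// the output path below is hypothetical.
using Microsoft.SqlServer.Dts.Runtime;

class PackageDump
{
    static void SavePackage(Package dynamicPackage)
    {
        Application app = new Application();
        // Write the package XML (.dtsx) to disk for inspection.
        app.SaveToXml(@"C:\Temp\DynamicPackage.dtsx", dynamicPackage, null);
    }
}
```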

SSIS: Use regex in the Foreach File Enumerator

Regex filter for Foreach Loop

Case
The standard Foreach Loop File Enumerator only has a wildcard filter, but in some cases a regular expression filter would be more useful. See/vote for this request at Microsoft Connect.

Solution
At the moment I’m working on a custom File Enumerator with regular expression support, but for those who don’t want to wait or don’t want to use custom components… You could achieve the same result with a Script Task.

*Update: now also available as a Custom Foreach Enumerator.*

1) Foreach Loop
For this case I will use a standard Foreach File Enumerator that fills a variable FilePath. The filter is *.* so all files will be returned.

Standard foreach loop

2) Variables
I will use two extra variables in this example: RegexFilter (string) for storing the regular expression and PassesRegexFilter (boolean) for indicating whether the filename passes the regular expression filter.

Variables

3) Script Task
Add a Script Task in front of the Data Flow Task and give it a suitable name.

Script Task

4) Add variables
Edit the Script Task and add the FilePath and RegexFilter as ReadOnlyVariables and the PassesRegexFilter as ReadWriteVariable.

Variables

5) The Script
Copy the following script to the Script Task.

// C# Code for filtering filenames with Regex
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;   // Added
using System.IO;                        // Added
namespace ST_02b6595da2274d7182409fb43af929ae.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion
        public void Main()
        {
            // Get the filename from the complete filepath
            String FileName = Path.GetFileName(Dts.Variables["User::FilePath"].Value.ToString());
            // Create a regex object with the pattern from the SSIS variable
            Regex FileFilterRegex = new Regex(Dts.Variables["User::RegexFilter"].Value.ToString());
            
            // Check if it is match and return that value (boolean) to the SSIS variable
            Dts.Variables["User::PassesRegexFilter"].Value = FileFilterRegex.IsMatch(FileName);
            
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}

or with VB.Net

' VB.Net Code for filtering filenames with Regex
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Text.RegularExpressions   ' Added
Imports System.IO                        ' Added
<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum
 
    Public Sub Main()
        ' Get the filename from the complete filepath
        Dim FileName As String = Path.GetFileName(Dts.Variables("User::FilePath").Value.ToString())
        ' Create a regex object with the pattern from the SSIS variable
        Dim FileFilterRegex As Regex = New Regex(Dts.Variables("User::RegexFilter").Value.ToString())
        ' Check if it is match and return that value (boolean) to the SSIS variable
        Dts.Variables("User::PassesRegexFilter").Value = FileFilterRegex.IsMatch(FileName)
        Dts.TaskResult = ScriptResults.Success
    End Sub
End Class

6) Precedence Constraints Expression
Connect the Script Task to the Data Flow Task and add an expression that does the real filtering:
@[User::PassesRegexFilter] == true

Precedence Constraint with expression

7) The result
For testing the result, I added another Script Task below the Data Flow Task that logs all files. My regular expression in this test case: .*[0-9]\.csv$    (all csv files whose filename ends with a number)

The result
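As a quick sanity check of that pattern outside SSIS (the filenames here are made up):

```csharp
using System;
using System.Text.RegularExpressions;

// Quick sanity check of the test-case pattern against made-up filenames.
class RegexFilterDemo
{
    public static bool Passes(string fileName)
    {
        return Regex.IsMatch(fileName, @".*[0-9]\.csv$");
    }

    static void Main()
    {
        Console.WriteLine(Passes("Sales2019.csv"));  // True: digit right before .csv
        Console.WriteLine(Passes("Sales.csv"));      // False: no digit before the extension
        Console.WriteLine(Passes("Sales2019.txt"));  // False: wrong extension
    }
}
```

Note that Regex.IsMatch is case-sensitive by default, so files arriving as .CSV would be skipped unless you pass RegexOptions.IgnoreCase.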

Note 1: to gain some performance I should have changed the Foreach Loop File Enumerator wildcard from *.* to *.csv

Note 2: this method could be a bit inefficient if your file collection contains thousands of files and you only need two of them.