SSIS/Excel 8-Row-Sampling/IMEX

quote-facts-are-stubborn-things_John Adams
Reno_Corporate_Offices_External_Ground_Flag

Importing Excel files with mixed data types: Why some values are converted to NULL and how to get around that problem

  • Sample data to illustrate what happens with mixed data types during import:

IMEX_1

Table results after importing file:

  • Values turned to NULL due to Majority Rule on Row Sampling that determines data type:

IMEX_2

Row sampling based off registry setting “TypeGuessRows=#”

  • TypeGuessRows # defaults to 8 with a max of 16
  • Setting TypeGuessRows = 0 will scan all rows (per this article)
  • Altering registry is obviously risky and rarely permitted (particularly on servers)
    • TypeGuessRows = 0 would not be a good option in general
    • Even if you set the max to 16 — any values that follow and don’t conform to the data type defined in the first 16 rows will all be imported as NULL

Enter IMEX

SSIS ConnectionString property: Extended property setting of IMEX

  • IMEX modes: 1 is Import , 0 is Export, 2 is Linked

  • All illustrations are in Import Mode

IMEX_3
  • IMEX bypasses Majority Rule so that all values become text — as long as there is some mix of types within the row sampling:

IMEX_4

But if IMEX only applies to the Row Sampling registry setting (TypeGuessRows = #) — what do you do about the rows that follow (9 and beyond below)?

  • Results in the following whether IMEX is used or not

IMEX_6

IMEX Workaround for the above

  • Dummy row

IMEX_7

Additional Material