Python iteration loop problem

This is the place for queries that don't fit in any of the other categories.

Python iteration loop problem

Postby KatJ9 » Tue Jun 04, 2013 3:25 pm

Hello! I'm writing a python script that will import all the excel files in a folder and write them into a SQL Server table. The script runs if I just point it to a single excel file, but i'm stumped at setting up an iteration loop that will read through all the excel files in a folder. I need to insert some type of loop inbetween the "if file_to_import.endswith('.XLS'):" & the "column_count=10" lines. Any ideas or suggestions would be very much appreciated!

Code: Select all
# Import arcpy module
from xlrd import open_workbook ,cellname
import arcpy
import pyodbc as p
import os

# Database Connection Info
server = "Server"
database = "DB"
connStr = ('DRIVER={SQL Server Native Client 10.0};SERVER=' + server + ';DATABASE=' + database + ';' + 'Trusted_Connection=yes')

# Assign path to Excel files
folder_to_import = '\\\\Location\\DATA'
l_files_to_import = os.listdir(folder_to_import)
for file_to_import in l_files_to_import:
    if file_to_import.endswith('.XLS'):

            column_count=10

# Open entire workbook
book = open_workbook(file_to_import)

# Use first sheet
sheet = book.sheet_by_index(0)

# Open connection to SQL Server Table
conn = p.connect(connStr)

# Get cursor
cursor = conn.cursor()

# Assign the query string without values once, outside the loop
query = "INSERT INTO HED_EMPLOYEE_DATA (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"

# Iterate through each row
for row_index in range(1, sheet.nrows):
    row_num          = row_index
    Company          = sheet.cell(row_index,0).value
    Contact          = sheet.cell(row_index,1).value
    Email            = sheet.cell(row_index,2).value
    Name             = sheet.cell(row_index,3).value
    Address          = sheet.cell(row_index,4).value
    City             = sheet.cell(row_index,5).value
    CentralCities    = sheet.cell(row_index,6).value
    EnterpriseZones  = sheet.cell(row_index,7).value
    NEZ              = sheet.cell(row_index,8).value
    CDBG             = sheet.cell(row_index,9).value
    values = (Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG)
    cursor.execute(query, values)

# Close cursor
cursor.close()

# Commit transaction
conn.commit()

# Close SQL server connection
conn.close()
KatJ9
 
Posts: 5
Joined: Tue Jun 04, 2013 3:19 pm

Re: Python iteration loop problem

Postby setrofim » Tue Jun 04, 2013 3:46 pm

You already have a loop that iterates over all files. Just move the open_workbook() line and everything after it into the for loop (well, you probably don't want be opening and closing the DB connection on each iteration, so do that before the loop).
setrofim
 
Posts: 288
Joined: Mon Mar 04, 2013 7:52 pm

Re: Python iteration loop problem

Postby KatJ9 » Tue Jun 04, 2013 3:56 pm

Hi setrofim, thank you for the reply! I tried that and now i'm getting the error "IOError: [Errno 2] No such file or directory: 'Report_Test.XLS'" I checked in the directory and Report_Test.XLS is one of the excel files in there so i'm not sure what's going wrong :(
KatJ9
 
Posts: 5
Joined: Tue Jun 04, 2013 3:19 pm

Re: Python iteration loop problem

Postby micseydel » Tue Jun 04, 2013 4:34 pm

You need to use os.path.join(folder_to_import, file_to_import), since when you open the file you're saying the name of the file but not where that file is.
Join the #python-forum IRC channel on irc.freenode.net!

Please do not PM members regarding questions which are meant to be discussed publicly. The point of the forum is so that others can benefit from it. We don't want to help you over PMs or emails.
User avatar
micseydel
 
Posts: 1497
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA

Re: Python iteration loop problem

Postby KatJ9 » Tue Jun 04, 2013 4:49 pm

Thank you for the tip micseydel!
I'm now getting the error "UnicodeDecodeError: 'utf16' codec can't decode byte 0x20 in position 108: truncated data"
That part of my code looks like this now, perhaps I put inserted it into the wrong place?
I'd actually like for it to delete each file after it writes the data into the SQL table, maybe that will fix this issue?
Thank you for your patience, i'm still a newb :)

Code: Select all
# Assign path to Excel files
folder_to_import = '\\\\location\\'
l_files_to_import = os.listdir(folder_to_import)
for file_to_import in l_files_to_import:
    if file_to_import.endswith('.XLS'):
            print"Assign path to excel files complete"
            column_count=10
            print"Column count 10 complete"
            book = open_workbook(os.path.join(folder_to_import, file_to_import))
            print"Open workbook complete!"
            sheet = book.sheet_by_index(0)
KatJ9
 
Posts: 5
Joined: Tue Jun 04, 2013 3:19 pm

Re: Python iteration loop problem

Postby micseydel » Tue Jun 04, 2013 4:56 pm

You should always give the full traceback, not just the error message. For example, if I do
Code: Select all
print x

without defining x, I get
Code: Select all
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    print x
NameError: name 'x' is not defined

This includes the line number, the line, and if there are nested function calls it includes that information as well.
Join the #python-forum IRC channel on irc.freenode.net!

Please do not PM members regarding questions which are meant to be discussed publicly. The point of the forum is so that others can benefit from it. We don't want to help you over PMs or emails.
User avatar
micseydel
 
Posts: 1497
Joined: Tue Feb 12, 2013 2:18 am
Location: Mountain View, CA

Re: Python iteration loop problem

Postby KatJ9 » Tue Jun 04, 2013 5:13 pm

This was the full error message I got:

Traceback (most recent call last):
File "C:\folder\Projects\Import.py", line 52, in <module>
book = open_workbook(os.path.join(folder_to_import, file_to_import))
File "C:\Python26\ArcGIS10.0\lib\xlrd\__init__.py", line 435, in open_workbook
ragged_rows=ragged_rows,
File "C:\Python26\ArcGIS10.0\lib\xlrd\book.py", line 116, in open_workbook_xls
bk.parse_globals()
File "C:\Python26\ArcGIS10.0\lib\xlrd\book.py", line 1208, in parse_globals
self.handle_writeaccess(data)
File "C:\Python26\ArcGIS10.0\lib\xlrd\book.py", line 1173, in handle_writeaccess
strg = unpack_unicode(data, 0, lenlen=2)
File "C:\Python26\ArcGIS10.0\lib\xlrd\biffh.py", line 303, in unpack_unicode
strg = unicode(rawstrg, 'utf_16_le')
File "C:\Python26\ArcGIS10.0\lib\encodings\utf_16_le.py", line 16, in decode
return codecs.utf_16_le_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x20 in position 108: truncated data
KatJ9
 
Posts: 5
Joined: Tue Jun 04, 2013 3:19 pm

Re: Python iteration loop problem

Postby MichelFJM » Wed Jun 05, 2013 2:06 pm

Hello

It seems to me that open_workbook has a problem with a special character included in the excel file (é, €, ...).
You said it was working with the first file, without the loop. Are some files ok , some not ?
MichelFJM
 
Posts: 19
Joined: Wed May 22, 2013 1:41 pm

Re: Python iteration loop problem

Postby KatJ9 » Wed Jun 05, 2013 2:41 pm

They all have normal letters & numbers, however there is the @ character in there for email addresses. Could that be it?
KatJ9
 
Posts: 5
Joined: Tue Jun 04, 2013 3:19 pm

Re: Python iteration loop problem

Postby setrofim » Wed Jun 05, 2013 3:15 pm

No the @ is an ASCII character, that won't be the problem. You can specify a different encoding when you open the workbook. Try
Code: Select all
book = open_workbook(os.path.join(folder_to_import, file_to_import), encoding_override='cp1252')

If "cp1252" doesn't work, "utf-8" is also worth trying.
setrofim
 
Posts: 288
Joined: Mon Mar 04, 2013 7:52 pm


Return to General Coding Help

Who is online

Users browsing this forum: No registered users and 2 guests