Importing CSV or Excel Files
This tutorial dicusses the process of importing data in a CSV (comma separated) or from an MS Excel (.xls) file. While the process for importing these two file types is similar, there are some differences, and some advantages and disavantages to each.
After completing this tutorial, you will be able to:
- Identify the proper layout for a spreadsheet to be converted to CSV for import.
- Convert spreadsheets to a CSV in Excel
- Import a simple CSV file with production data into PHDWin
- Identify the advantages of CSV over Excel files for import
- Identify the advantages of Excel files over CSV for import
- Prepare Excel Files for importing into PHDWin
- Import simple Excel files into PHDWin
When importing a CSV file into PHDWin it is important to have the proper structure to the document. The following must be true for the CSV file to import properly:
- The data must start in cell A1
- Row 1 must conatin header data (titles) for the columns
- Column A must contain the unique identifier that will be used to determine duplicates in PHDWin
- There cannot be any skipped columns or rows in the data
- The data must exist on one sheet (no multi-sheet workbooks)
Saving as CSV
Excel spreadsheets can easily be saved as a CSV file. Whether you open an existing spreadsheet, or create one from scratch, to save the file in the CSV format for importing into PHDWin:
- In Excel go to Files - Save As...
- Choose CSV (Comma delimited) in the Save As Type drop down box
- Click Save
Excel will prompt you with one, or both of the following warnings when saving as a CSV.
This warning states that CSV files do not support multi-sheet workbooks. Therefore we can only save the active sheet into the CSV file. Later in this tutorial we will see that to handle multi-sheet workbooks we will need to use the Excel import.
This prompt is warning the user that there may be features that are used in the spreadsheet that are not supported in CSV formats. Since CSV files are text based, and very simplistic, even things like fonts and colors will be lost. This does not affect the data in any way, and in fact, it is this simplicity of the file format that makes them predictable, and therefore useful for imports.
Importing in PHDWinThe Add/Import Wizard is used to create new cases from scratch, or to import data from any external source. For importing purposes, the user selects the Import option button at the top of the window as pictured below.
The next step is to locate the file that is to be imported. Clicking the [...] browse button will open a window that allows you to browse your computer and locate the file to be imported.
Once the file has been selected you may use the remaining options on the page to add additional information to the incoming cases. The options allow you to apply an economic scenario, specify a default working and revenue interest, set the default country of origin, and even whether PHDWin is to apply an autofit projection to the incoming data.
Note: The options here will not overwrite the data to be imported, nor will they affect any cases in the database if the import is being used to update existing cases.
Importing in PHDWin
Whenever importing into a project file that already has cases existing in it, this window will open, allowing you to specify what ID code will be used to determine a matching case, as well as how to handle the overwriting and appending of data for cases that do find a match.
The top section of this window is used to select the ID Code that is to be used to determine whether the incoming case already exists in PHDWin, or if it should be treated as a new case. This selection is based on the ID Code that is in Column A of the CSV file. So if the CSV file contains the API number, for instance, you would highlight API in this list.
NOTE: When the code is unknown, or non-existent in PHDWin, the imported data will create new cases. In those instances where that is the intention, and the incoming ID code is not intended to match an existing code choose "Retrieval Code" from the list.
The bottom section of this window is used to determine what data takes precedence if there is a match found for the incoming data in the existing PHDWin database. In the case of conflict the user can choose to overwrite the data that exists in the PHDWin file, or to ignore the incoming data and leave the data in the PHDWin file intact.
Creating Import Mappings
Once you have selected the file to be imported, and the ID code that is to determine a matching case between the import file and
the PHDWin file, the next step is to create a mapping that will tell PHDWin what each column in the CSV file represents. This
map tells PHDWin what fields to store the incoming data in.
The window above allows the user to select from an existing mapping, or to create a new one. Mappings are stored off so that if the user needs to import files with the same layout in the future, they do not have to recreate the mapping each time.
If you have not imported a CSV file with an identical layout previously, you will click the New button to create a new layout from scratch. PHDWin will prompt the user for a name for the CSV. While it is good to be descriptive, you should also keep the name relatively generic. In other words, try to avoid naming it by the content of the CSV like a field name, but rather by the structure of the CSV.
Once the layout has been named, you will need to specify the File Type that the import is related to. In this case, the CSV radio button is selected. Only after you have done these two steps will the Master Table and Data Fields options become active.
The Master Table button is used to select the main table and linking field for Excel and ODBC types of imports. It is not necessary for a CSV or other Text imports.
Mapping Data FieldsWhen the Data Fields button is clicked on the previous screen, the window pictured below opens, allowing the user to map the incoming fields from the CSV file to an existing field in the PHDWin file.
This window is essentially broken down into two halves. The left portion of the window gives us a window into the CSV file. It contains several columns that help us identify the incoming data, to help us better determine what to map it to in PHDWin.
- Column displays the column headers from Row 1 of the CSV file.
- Sort is used to prioritize groupings of data. This is not used for CSV imports.
- PHDWin Database Field indicates the field in PHDWin that this item has been mapped to.
- Data Value shows row 2 from the CSV file to give the user a look at the actual data to make sure that the data is lined up correctly to the headers and to ensure that the correct formats are being used.
The right hand side of this window display all of the fields in PHDWin into which data can be imported.
The PHDWin data fields are broken down into categories to make them easier to find. The category is displayed in the drop-down menu in the top right corner of the window. Commonly used categories include "Lease Titles" which contains most header information such as Case Name, Class and Category, Major Phase, ID codes etc. Prod Volumes, which contains monthly production data fields, and Test Data which contains daily test data fields.
All of the data in the CSV file shown in the example relates to monthly production, thus all of the fields would be found under that category.
The goal for the user on this window is to match each of the CSV fields on the left to a PHDWin data field on the right by dragging and dropping the PHDWin Data field to the appropriate row on the left. Rows on the left that are left blank, will simply be ignored by the import, and will not be brought in to the PHDWin database.
Handling New Cases
Once the Finish button is clicked, PHDWin begins importing the data from the file. If PHDWin finds an ID code from the file that does not have a matching code in the existing PHDWin file, the following window will open asking the user how to handle this case.
- Yes will create a new case in PHDWin out of the incoming data.
- Yes to All will automatically create a new case for unmatched IDs without prompting the user each time.
- No will skip any data associated with this ID and the case info will not be imported.
- No to All will skip this, and any other unmatched IDs without prompting the user each time.
Once PHDWin has brought in the data from the file, a window will open informing the user of the number of cases that had data imported to them. This count includes both newly created cases, as well as existing cases that were affected.