Thursday, November 20, 2014

Exercise 7: Part 2 - Network Analysis

Introduction

Part 2 of Exercise 7 introduced a GIS method I had not worked with before: network analysis. In this case, network analysis is used to find the shortest route between each active sand mine in Wisconsin and the nearest rail terminal. The end result is a map of the routes between the sand mines and rail terminals, along with a table showing the estimated cost of truck travel on those routes, summarized by county. Following the Python script completed in Part 1, the objectives of Exercise 7 include:
- Load features into the Network Analysis Window
- Calculate a route
- Calculate a closest facility and route
- Build a model to calculate the closest facility route.
- Calculate the cost of sand truck travel on roads by county.


Methods

The first step was to become familiar with the Network Analyst tools. In class we tested two of the functions, new route and closest facility, to see how they worked. We then added the main components for the analysis: the mines from Part 1, rail terminals provided by our professor, Christina Hupy, and streets from an Esri database. We narrowed the rail terminals down to those located in Wisconsin (plus one in Winona, MN) and to those that ship by rail rather than by air. The next step was to find the closest facility by loading the mines as incidents and the rail terminals as facilities. The tool then calculates the closest facility for each mine along the Esri streets network; the results are shown in figure 1.

Figure 1: Result after the closest facility tool was run.
Purple squares: mines; black circles: rail terminals; orange lines: routes
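
The closest facility step described above can be illustrated with a small, self-contained sketch. This is not how the Network Analyst tool is implemented internally; it is a toy version that assumes the streets are stored as a dictionary of nodes and edge lengths, and it uses Dijkstra's algorithm to pick the facility with the smallest network distance. The node names and lengths are hypothetical.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra: shortest network distance from source to every reachable node."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, length in graph.get(node, []):
            new_d = d + length
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(queue, (new_d, neighbor))
    return dist

def closest_facility(graph, incident, facilities):
    """Return (facility, distance) for the nearest reachable facility, or None."""
    dist = shortest_distances(graph, incident)
    reachable = [(dist[f], f) for f in facilities if f in dist]
    if not reachable:
        return None  # the incident is not connected to any facility
    d, facility = min(reachable)
    return facility, d

# Hypothetical toy road network: node -> [(neighbor, length in meters)]
roads = {
    "mine_a": [("junction", 4000.0)],
    "junction": [("mine_a", 4000.0), ("terminal_1", 6000.0), ("terminal_2", 2500.0)],
    "terminal_1": [("junction", 6000.0)],
    "terminal_2": [("junction", 2500.0)],
    "mine_b": [],  # a mine with no road connection in this toy network
}
terminals = ["terminal_1", "terminal_2"]

print(closest_facility(roads, "mine_a", terminals))
print(closest_facility(roads, "mine_b", terminals))
```

In this sketch the mines play the role of incidents and the rail terminals the role of facilities. An incident that cannot reach any facility over the network gets no route at all.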


The next step was to build a model in ArcMap using ModelBuilder to find the total cost of driving from each active sand mine in Wisconsin to its rail terminal. To come up with the cost, we assumed that each sand mine sends 50 truck trips per year to the rail terminal, that each truck also has to return, and that the hypothetical cost per truck mile is 2.2 cents. These hypothetical numbers were provided by our professor because we had no way of calculating the actual figures. In ModelBuilder the same steps were repeated: adding the closest facility layer, then loading the mines as incidents and the rail terminals as facilities. After that was solved, the next task was to take the routes, project them, and add fields to calculate the cost. The first field added was length, calculated as the route length divided by 1609, because there are about 1609 meters in a mile. Next we added a cost field, calculated as the length multiplied by 100 (50 trips per year, each a round trip) and by 0.022 (2.2 cents per mile, expressed in dollars). Summarizing by county then gave an output table with the total cost of sand truck travel for each county.

Equations used in my field calculator:
Length = route length / 1609
Cost = length * 100 * .022
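
The two field calculator equations can also be written as a short Python sketch. The constants come from the assumptions stated above (1609 meters per mile, 50 round trips per year, 2.2 cents per truck mile); the function names and the example route length are made up for illustration.

```python
METERS_PER_MILE = 1609   # rounded conversion used in the exercise
TRIPS_PER_YEAR = 50      # truck trips from each mine to its rail terminal
COST_PER_MILE = 0.022    # 2.2 cents per truck mile, in dollars

def route_length_miles(route_length_m):
    """Length field: convert a route length in meters to miles."""
    return route_length_m / METERS_PER_MILE

def annual_route_cost(route_length_m):
    """Cost field: miles * 100 one-way trips (50 round trips) * dollars per mile."""
    one_way_trips = TRIPS_PER_YEAR * 2
    return route_length_miles(route_length_m) * one_way_trips * COST_PER_MILE

# Hypothetical route of 16,090 meters, which is 10 miles with this conversion
example_cost = annual_route_cost(16090)
print(round(example_cost, 2))
```

For a 10 mile route this works out to about 22 dollars per year, which shows how small the hypothetical per-mile rate makes the final totals.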

Figure 2: Completed Model

Results

The completed model gave me mixed results. Figure 3 below is the map I created to show the routes taken between sand mines and rail terminals. Comparing it to figure 1, there is no route between the northwest mine in Wisconsin and a rail terminal. I could not figure out how to fix this problem, even after several attempts. When I looked closely I could see a route going to a nonexistent mine on the Minnesota border. I am not sure why that mine's route was going to a mine I had selected out many steps before. However, the rest of the map shows that many mines send their trucks to terminals in Chippewa County, Wood County, Trempealeau County, and Winona, Minnesota, on the border of Trempealeau County.

Figure 3: Map showing the routes between mines and rail terminals

After calculating the cost field, I found that Chippewa County had the highest cost of sand truck travel on roads, at 462 dollars. It was followed by Dunn County and Wood County, which totaled 353 and 311 dollars. On the map above you can see many routes passing through Eau Claire, Chippewa, and Dunn Counties, which agrees with the table below, where these counties rank among the most costly. Wood County is in the middle of the state, where a good number of mines all travel a short distance to one rail terminal. The table below shows that Wood County has the highest frequency of trips on roads toward rail terminals, which adds up to a high cost.


Figure 4: Table showing cost of travels on roads per county
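
The summarize-by-county step amounts to adding up the cost of every route piece that falls inside each county. A minimal sketch of that idea is below. It assumes the routes have already been split at county boundaries and each piece already has a cost; the county names and dollar values are placeholders, not results from this exercise.

```python
from collections import defaultdict

def summarize_cost_by_county(route_pieces):
    """Sum the cost of all route pieces per county.

    route_pieces is a list of (county_name, cost_in_dollars) tuples.
    """
    totals = defaultdict(float)
    for county, cost in route_pieces:
        totals[county] += cost
    return dict(totals)

# Placeholder values for illustration only
pieces = [("County A", 10.0), ("County B", 4.5), ("County A", 2.5)]

totals = summarize_cost_by_county(pieces)
# Rank counties from highest to lowest total cost
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

A county can rank high either because long routes cross it or because many short routes do, since both add to the same total.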

Conclusion

Running a network analysis from the sand mines to the rail terminals turned out to be a new challenge, because this was the first time I have worked with it and also my first time working with ModelBuilder in over a year. ModelBuilder can be a tool of great efficiency and clarity, but it can also be a challenge, because if one step is wrong the model will not run. It will take several more sessions to become fluent with both tools, and each one will further my knowledge of ArcMap. The results could have been much cleaner, especially since I missed one mine in the northern part of the state. However, I was happy with how my final map and table turned out, as they both represented my results the way I wanted.

Sources:

Esri Geodatabase







Tuesday, November 11, 2014

Exercise 7: Part 1 - Python Script, Network Analysis - Data Preparation

Part 1 prepared the data for network analysis by writing a Python script to select the sand mines in Wisconsin to be used. The criteria we used to select mines were: the mine must be active, the mine must not also be a rail loading station, and any mine within 1.5 km of a railroad is eliminated. We eliminate sand mines near railroads because in Part 2 we will run a network analysis from the sand mines to the nearest rail terminals and estimate the number of truck trips and the cost of this traffic on local roads.
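
The three selection criteria can be sketched in plain Python as a simple filter. This is not the script from the exercise, which worked on feature classes in ArcMap. The record layout, the field names (status, has_rail_loading, distance_to_rail_m), and the sample mines are all hypothetical.

```python
MIN_DISTANCE_TO_RAIL_M = 1500  # 1.5 km threshold from the criteria above

def keep_mine(mine):
    """Return True if a mine passes all three selection criteria."""
    return (
        mine["status"] == "Active"                                # must be active
        and not mine["has_rail_loading"]                          # not a rail loading station
        and mine["distance_to_rail_m"] > MIN_DISTANCE_TO_RAIL_M   # not within 1.5 km of a railroad
    )

# Hypothetical sample records for illustration only
mines = [
    {"id": 1, "status": "Active", "has_rail_loading": False, "distance_to_rail_m": 5200.0},
    {"id": 2, "status": "Inactive", "has_rail_loading": False, "distance_to_rail_m": 8000.0},
    {"id": 3, "status": "Active", "has_rail_loading": True, "distance_to_rail_m": 6000.0},
    {"id": 4, "status": "Active", "has_rail_loading": False, "distance_to_rail_m": 900.0},
]

selected_ids = [mine["id"] for mine in mines if keep_mine(mine)]
print(selected_ids)
```

Only the first sample mine passes: the second is inactive, the third has rail loading, and the fourth is too close to a railroad.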

The Python script produced a point feature class containing the 41 mines that will be used for network analysis in Part 2 of Exercise 7.

Figure 1: Python Script Completed

Friday, November 7, 2014

Exercise 6: Data Normalization, Geocoding, and Error Assessment

Introduction

The goal of this lab was to develop skills in data normalization and geocoding, and then in assessing the errors after the processes had been run. For this exercise the class was given an Excel spreadsheet containing the locations of different sand mines across Wisconsin. The table included addresses, facility names, operator, city name, county, and more (see figure 2 below). The only problem was that some of the addresses were not normalized and contained only PLSS information, for example: NE SW Sec 2, 7N, 3W. Our task was to fix these addresses and other information so the data could be used in ArcMap for geocoding. The last step of the exercise involved querying out the mines assigned to us and assessing the location of the mines using ArcMap.

Methods

As I stated above, each student was given an Excel sheet containing information about sand mines in Wisconsin (figure 2). The task was to normalize the data so that the correct facility address, community (city), zip code, state, and unique mine ID were entered into a new personal Excel sheet for future use. Each student was responsible for at least 16 mines; I ended up normalizing 22. Dr. Christina Hupy, the course instructor, developed a system in which each student was assigned a number and each number was assigned to different mines. As a result, four students attempted to normalize each mine, which let us test the accuracy of our geocoding, as can be seen in the results section below. To find the correct address for each mine I used Google Maps, Google search, and the PLSS shapefile that our geospatial technician, Martin Goetll, provided. Finding the correct addresses was a large task, but searching the web with the partial addresses provided and the facility name made it easier. Normalizing the data is critical for geocoding: if you tried to geocode the mines before normalizing, it would not work.

After normalizing the data, the next step is to geocode the mines in ArcMap using the geocode tool. This involves signing into Esri's ArcGIS Online, adding the Excel sheet of normalized mines, filling out the correct parameters, and finally running the tool. After geocoding the addresses, a table like figure 1 appears on the screen. In my case all 22 of my mines matched with a score of 90 or higher, so I did not have to manually match any of them. In fact, all but six had a score of 100, showing that I did a good job normalizing my table for geocoding.

Figure 1: Results showing how many addresses matched after geocoding

After geocoding, the next step is to merge all of my classmates' mines into one feature class and then query out the mines that match mine in order to compare accuracy. Querying out the matching mines was a somewhat difficult task because some classmates did not normalize the data correctly. The mine IDs were not all located in the same field, so I had to search through the attribute table to find all the mine IDs that matched mine. After the matching mines were queried out, the Point Distance tool was run to measure the distance between my mines and the matching ones.
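
The idea behind the distance check can be sketched in a few lines of Python. This is only an illustration of straight-line distance between two sets of points, not the ArcMap tool itself. It assumes the coordinates are already in a projected system measured in meters, and the IDs and coordinates are hypothetical.

```python
import math

def point_distances(input_points, near_points):
    """Return (input_id, near_id, distance) for every pair of points.

    Both arguments map an ID to an (x, y) coordinate in meters.
    """
    rows = []
    for input_id, (x1, y1) in input_points.items():
        for near_id, (x2, y2) in near_points.items():
            rows.append((input_id, near_id, math.hypot(x2 - x1, y2 - y1)))
    return rows

# Hypothetical projected coordinates (meters)
my_mines = {1: (500000.0, 400000.0)}
classmate_mines = {"a": (500000.0, 400000.0), "b": (500300.0, 400400.0)}

for row in point_distances(my_mines, classmate_mines):
    print(row)
```

A distance of zero means two students geocoded the mine to exactly the same location, while a large distance points to a difference in how the address was normalized.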


Results

Figure 2: Mine table before they were normalized  

Figure 3: Part of my normalized table
Figure 2 shows the table provided to us with the information on the mines, and figure 3 shows part of my normalized table. Figure 4 below shows the location of my mines in purple and the queried/matched mines in black. As you can see, some of my mines and my classmates' mines matched up perfectly, while others did not match exactly but were decently close. One thing to note is that the same mine can appear two or three times in black, since each mine was normalized four times by our class, resulting in more black squares than purple triangles.

Figure 4: Map showing my mines in purple and queried mines in black

The table below shows my mines (Input) and the queried mines (near) and the distance, in meters, between them.  As you can see about two thirds of my mines matched very accurately with the queried mines but some of them did not match well at all.

Figure 5: Table showing the distance in meters between my mines and the queried mines


Discussion

I experienced many errors while working through this exercise, but I think this assignment was meant to produce errors and challenge us in handling them. Using my classmates' normalized tables caused errors when querying out the mines I needed. Some people in the class did not use the correct mine ID field when normalizing, making it difficult to find the mines I needed. Another source of error came when trying to find the correct address, zip code, and city of each mine. Because some of the mines only had PLSS information, finding the correct address became difficult. Also, because some of the mines do not exist yet and are proposed or inactive, finding an existing address was tough. After I ran the Point Distance tool to check the accuracy of my mines against my classmates' mines, errors were common in the table. In figure 5, from input 2 down, you can see that the matches are not accurate at all. This was because of a data entry error when normalizing the tables, made either by me or by one of my classmates. I did notice that for one of my entries I did not put an S in front of the address, so it was not placed in the same location as my classmates' entries that did include the S. Leaving out the S kept the geocoder from matching it to the right address, which caused an error.

How can we know which points are actually correct and which ones are not? We can tell which mines are accurate by looking at the matched address field after geocoding. If the matched address field contains the same address for both my mine and my classmate's, then that is most likely the correct location of the mine. You can also look at the distance field in figure 5 for support: if it is very close to zero, then it is the correct location. We can also tell a location is correct if the score is 100 and the status is M, which means matched. By looking at these three things in both my mines table and my classmates' mines tables, I can tell which mines have the correct location and which do not.
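
The three checks described above could be combined into one rule, sketched here in Python. Requiring all three checks at once, the field names (status, score, match_addr), the one meter tolerance, and the sample rows are all assumptions made for illustration.

```python
def looks_correct(my_row, classmate_row, distance_m, tolerance_m=1.0):
    """Apply the three checks: matched address, distance, and score/status.

    tolerance_m is a hypothetical cutoff for "very close to zero".
    """
    same_address = my_row["match_addr"] == classmate_row["match_addr"]
    close_enough = distance_m <= tolerance_m
    full_match = my_row["status"] == "M" and my_row["score"] == 100
    return same_address and close_enough and full_match

# Hypothetical geocoding results for one mine
mine_row = {"status": "M", "score": 100, "match_addr": "123 Example Rd"}
agreeing_row = {"status": "M", "score": 100, "match_addr": "123 Example Rd"}
differing_row = {"status": "M", "score": 92, "match_addr": "S 123 Example Rd"}

print(looks_correct(mine_row, agreeing_row, distance_m=0.0))
print(looks_correct(mine_row, differing_row, distance_m=850.0))
```

A stricter or looser rule is possible, for example accepting any two of the three checks, but agreement between independent entries is the main signal in each case.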

Conclusion

This exercise developed skills in normalizing tables, geocoding, and assessing errors. It also helped develop new skills for working with the location of a building. It shows how accurate and clean you need to be when geocoding in ArcMap. I was pretty happy with the results I got: I think I normalized my table well, and the accuracy of my mines compared to my classmates' was average. With another assignment like this, I am sure we would be more efficient and smooth when normalizing the tables and finding the correct addresses.

Sources:
Mines provided by Christina Hupy
US Census Bureau