
Friday, August 7, 2015

Module 11 - Sharing Tools

The final lab of the semester was short and sweet - we created parameter help messages associated with a script tool, then embedded the main Python script within the tool in order to make it easier to distribute.  To keep our scripts 'safe' we password-protected the embedded script - which means that only those who know the password can see or modify the Python script.

The script tool that we modified does the following: it creates a series of random points within the extent of an input feature class, then creates a buffer around each of the randomly created points.  The random points are offset from each other by a minimum distance, as defined by the tool's end user.  A screenshot of the tool dialog box and output results is shown below.

Tool dialog box with custom parameter help messages on the left, and the tool results on the right.
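
For the curious, here's a rough sketch of what the tool's core logic might look like in arcpy - the parameter order, point count, and output names are illustrative choices of mine, not the actual assignment script:

import arcpy

# Hypothetical parameter order - the real tool's parameters may differ
out_ws = arcpy.GetParameterAsText(0)      # output workspace
extent_fc = arcpy.GetParameterAsText(1)   # feature class defining the extent
min_dist = arcpy.GetParameterAsText(2)    # minimum offset, e.g. "500 Meters"
buff_dist = arcpy.GetParameterAsText(3)   # buffer distance, e.g. "1000 Meters"

# Create random points constrained by the input extent and minimum spacing
points = arcpy.CreateRandomPoints_management(
    out_ws, "random_points", extent_fc, "", 30, min_dist)

# Buffer the newly created points
arcpy.Buffer_analysis(points, out_ws + "/random_points_buffer", buff_dist)
arcpy.AddMessage("Random points created and buffered.")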

Parting Thoughts...

This course has proved to be very useful... I'm still in shock over being able to create a custom script tool that actually works!  Learning the basics of how to code in Python has been very cool and I definitely think the lessons 'took'.  Even if I never get a chance to write another custom tool in my life, just being able to understand what all those code examples on the ArcGIS Help pages mean will come in handy.  In a strange way I think I have a better understanding of how most tools run now that I can get the gist of their associated code.

Sunday, July 26, 2015

Discussion Post 2: Using the Geographical Collocates Tool – a custom tool for text-based spatial analysis



The article I selected for this discussion topic is entitled Automatically Analyzing Large Texts in a GIS Environment: The Registrar General's Reports and Cholera in the 19th Century.  The researchers use techniques developed within the fields of Natural Language Processing (a sub-field of computer science that focuses on automating the analysis of human language via computer) and Corpus Linguistics (a sub-field of linguistics that examines large bodies of text for linguistic relationships) to shape the way their custom tool, called the Geographical Collocates Tool, analyzes large bodies of digitized text.  The tool identifies place-names associated with a specific topic - in the paper the focus was on cholera, diarrhea, and dysentery as documented within the Registrar General's reports for the years 1840 - 1880.

The tool works as follows: one first defines what words or phrases the tool should look for, then defines how 'far' the tool needs to look within the text to find an associated place-name.  This can be as far away as an entire paragraph, within the same sentence, or only up to five words away (just to give a few examples as described within the article).  The result is a database of word associations, locations within the text document (for more in-depth human review), place-names, and lat./long. coordinates for the associated place-names.  This of course requires a bit of geoparsing before running the tool - as was the case for the dataset examined within the article.  The next step involves running a series of fairly complex statistical analyses on the database results - which requires a more in-depth discussion than what I'm prepared to give here.
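
To make the proximity idea concrete, here's a toy Python sketch of windowed collocation matching - purely illustrative, and in no way the actual Geographical Collocates Tool (the word lists and window size are made up):

import re

# Toy example - finds topic words and any gazetteer place-name occurring
# within a fixed window of words, mimicking the 'up to five words away' rule
TOPIC_WORDS = {"cholera", "diarrhoea", "dysentery"}
PLACE_NAMES = {"london", "liverpool", "merthyr"}   # stand-in gazetteer
WINDOW = 5                                         # words on either side

def collocates(text, window=WINDOW):
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok in TOPIC_WORDS:
            nearby = tokens[max(0, i - window): i + window + 1]
            for place in PLACE_NAMES & set(nearby):
                hits.append((tok, place, i))   # word, place, position in text
    return hits

print(collocates("An outbreak of cholera was reported in London last week."))
# [('cholera', 'london', 3)]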

The main take-away for me was the use of collocation to group the results of their tool, and the idea that while not every place-name/word proximity association is meaningful, a pattern repeated often enough becomes statistically significant (p. 300).  The overall analysis results are also fascinating - outbreaks were numerous in the 1840s but dropped off by the 1870s (reflecting the discovery of the link between sanitation and disease, and the implementation of better public sanitation).  The analysis also showed a bit of a policy bias - London had the greatest public and governmental focus owing to the raw counts of deaths related to the outbreaks, but other cities, particularly Merthyr Tydfil in Wales, had the highest mortality rates relative to their overall populations.  The results also showed a spike in disease mentions in 1868 - because a disease history report covering the years 1831 to 1868 had been published that year (and so it correctly showed up as a statistically significant spike within the analysis).

Essentially, this article highlights an effective and relatively accurate way to analyze large amounts of text (without spending years doing so), a way to find and analyze spatial patterns based on specific topics, and a completely new way to approach historic documents and frame associated research questions.

Reference:
Murrieta-Flores, P., Baron, A., Gregory, I., Hardie, A., and Rayson, P.  (2015)  Automatically Analyzing Large Texts in a GIS Environment: The Registrar General's Reports and Cholera in the 19th Century.  Transactions in GIS, 19(2): 296-320.

Module 10 - Creating Custom Tools

This week we created a custom tool using a Python-based script. The script simply clips several files at once to a single defined extent. There are several differences between a script and a tool, the most notable being:
  • stand-alone scripts need to work with hard-coded file paths and variables
  • stand-alone scripts tend to be run within an IDE
  • tools do not need hard-coded file paths and variables - in fact, hard-coding is usually discouraged
  • the use of tools within ArcGIS does not require any knowledge of Python whatsoever
  • while both scripts and tools can be shared, tools are more easily shared because they are not generally tied to specific file paths
The first step in creating a custom tool is making sure that there is a toolbox within which to put it. Next, one adds a script to the toolbox - ArcGIS provides an Add Script wizard where the majority of the script-to-tool conversion takes place. Parameters for the tool can be defined at this time - for the assignment these were the input and output file paths, along with the clip boundary and input features. We also set up specific default file paths for the input and output parameters - but it would be just as easy to create the tool without these being completely defined. An example of the final tool is shown below.
The multi-clip tool opening screen.
In order to finalize the tool, the script needs to be modified so that it will work within a tool environment. Our original script referenced specific datasets and file paths and so needed to be a bit more flexible. This was accomplished by using the arcpy.GetParameterAsText() function to replace the hard-coded file paths and dataset locations. The arcpy.AddMessage() function was used to convert all Python interactive window print statements to printed text within the geoprocessing results window, as shown in the screenshot of the tool results below.

Screenshot of the geoprocessing results window after running the multi-clip tool.
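
Here's a minimal sketch of what the converted script might look like - the parameter order and output naming scheme are assumptions of mine, not the actual assignment code:

import arcpy
import os

# Hypothetical parameter order - the real tool's parameters may differ
out_folder = arcpy.GetParameterAsText(0)            # output workspace
clip_fc = arcpy.GetParameterAsText(1)               # clip boundary
input_fcs = arcpy.GetParameterAsText(2).split(";")  # multivalue input list

for fc in input_fcs:
    # Build an output name from the input feature class name
    out_fc = os.path.join(out_folder, arcpy.Describe(fc).baseName + "_clip.shp")
    arcpy.Clip_analysis(fc, clip_fc, out_fc)
    # AddMessage prints to the geoprocessing results window
    arcpy.AddMessage(out_fc + " has been clipped.")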
A fun little feature within ArcGIS is that one can edit both the code and the tool from within ArcCatalog - the Python IDE does not need to be opened! A downside is that the code is instead edited within Notepad, so any mistakes in syntax are not as easily caught. For this reason it seems to me that code edits within ArcCatalog should be kept to a minimum - meaning that it's best to have a complete script (or nearly so) when first creating a custom tool.

Sunday, July 19, 2015

Module 9 - Working with Rasters

This week's lab had us complete a basic suitability analysis using raster data - all from a Python script! Enabling Spatial Analyst within our code allowed us to reclassify a land cover dataset and modify an elevation raster to show a specific slope and aspect range. The raster results were then combined to form a single raster file with Boolean values - a '0' indicated areas that did not meet the stated requirements, and a '1' indicated areas that met all of them. An image of the final result is shown below.

Final script results showing the combined raster files.
Completing the code this week seemed pretty straightforward - the only problems I ran into concerned missing brackets or misspelled words. A pseudocode example of creating the raster files is as follows, with a rough Python sketch after it:

START
          Define the land cover variable and its Remap Value
          Define the output of the Remap Value land cover variable using Reclassify
          Create an elevation raster object
          Create the slope variable using the .Slope function
          Separate out all slope values < 20
          Separate out all slope values > 5
          Create the aspect variable using the .Aspect function
          Separate out all aspect values < 270
          Separate out all aspect values > 150
          Combine all of the slope, aspect, and land cover rasters into a single raster
          Save the final combined raster
END
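
And a rough arcpy equivalent of the pseudocode above - the file names, workspace path, and reclass values are illustrative stand-ins, not the actual assignment data:

import arcpy
from arcpy.sa import *  # requires the Spatial Analyst extension

arcpy.CheckOutExtension("Spatial")
arcpy.env.workspace = r"C:\data\module9"   # hypothetical path

# Reclassify land cover: suitable classes -> 1, unsuitable -> 0
# (the class values here are made up; unlisted values become NoData)
landcover = Reclassify("landcover.tif", "VALUE",
                       RemapValue([[41, 1], [42, 1], [43, 1], [11, 0], [21, 0]]))

# Slope between 5 and 20 degrees
elev = Raster("elevation.tif")
slope = Slope(elev, "DEGREE")
good_slope = (slope > 5) & (slope < 20)

# Aspect between 150 and 270 degrees (south- to west-facing)
aspect = Aspect(elev)
good_aspect = (aspect > 150) & (aspect < 270)

# Combine - 1 only where every condition is met
final = landcover * good_slope * good_aspect
final.save("suitable.tif")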

          

Saturday, July 18, 2015

Module 8 - Working with Geometries

This week's lab used Python scripting to copy specific data variables from an existing polyline shapefile to a text file.

Screenshot of a portion of the printed text file.
The screenshot above shows the following variables extracted from a polyline shapefile (called "rivers.shp"): OID number, vertex ID number, X coordinate, Y coordinate, and the part name. There are multiple values with the same OID and part name - that is because each polyline feature is stored as an array of points that connect to form the line. Each row therefore represents one vertex along a given polyline.

The script was short, but the syntax was a bit tricky to get the necessary variables to print correctly. In the end it was a matter of how many parentheses were being used and recalling which variable was associated with each row called. A quick rundown of the pseudocode used is as follows, with a rough Python sketch after it:


Start
    Import arcpy
    Set workspace environments
           Define workspace file path
           Enable overwriteOutput
           Define "fc" as "rivers.shp"
     Define "rivers.shp" variables to extract using SearchCursor
     Cursor in SearchCursor will look for data in the OID, SHAPE, and NAME fields
     Create text file
            Populate text file with rivers.shp data
            Define a vertex id variable
            For Search Cursor results, define the row:
                  For the Search Cursor row results.getPart():
                  print the OID, vertexid, (x, y coordinates), and NAME fields
                  write the OID, vertexid, (x, y coordinates), and NAME fields to the text file
      Close the text file
      Close the Search Cursor row
      Close the Search Cursor
END
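
A rough Python equivalent of the pseudocode above - the workspace path is a stand-in, and the course script may have used the older arcpy.SearchCursor rather than arcpy.da.SearchCursor:

import arcpy

arcpy.env.workspace = r"C:\data\module8"   # hypothetical path
arcpy.env.overwriteOutput = True
fc = "rivers.shp"

outfile = open("rivers.txt", "w")
vertexid = 0

# OID@, SHAPE@, and NAME return the object ID, geometry, and river name
with arcpy.da.SearchCursor(fc, ["OID@", "SHAPE@", "NAME"]) as cursor:
    for row in cursor:
        # getPart() returns an array of point arrays, one per polyline part
        for part in row[1].getPart():
            for point in part:
                vertexid += 1
                line = "{} {} {} {} {}\n".format(
                    row[0], vertexid, point.X, point.Y, row[2])
                print(line)
                outfile.write(line)

outfile.close()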
 

Sunday, July 12, 2015

Module 7 - Explore/Manipulate Spatial Data

This week's lab was one of the tougher ones... probably because the majority of the code was written without much code-building help from ArcGIS tool syntax. The goal for this lab was to create a new geodatabase, populate it with pre-existing shapefile data, then create and populate a dictionary of county seat cities in New Mexico (using the data that was copied into the new geodatabase).


Screenshot (in two parts) of the Module 7 script results.

Rough pseudocode generally replicating the above is as follows, with a Python sketch after it:
Start
     Set workspace environment
     Create new geodatabase
     Print shapefile feature class list
     Copy shapefile data to the new geodatabase
     Set SearchCursor to identify each city that is a county seat
     Create a new dictionary
     Populate the new dictionary with all cities that are county seats
          Key = City Name, Value = City Population
     Print the county seat dictionary
End
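
A rough Python equivalent of the pseudocode above - the workspace path, field names, and SQL query are illustrative assumptions, not the actual assignment values:

import arcpy

arcpy.env.workspace = r"C:\data\module7"   # hypothetical path
arcpy.env.overwriteOutput = True

# Create the new geodatabase and copy each shapefile into it
arcpy.CreateFileGDB_management(arcpy.env.workspace, "newmexico.gdb")
for fc in arcpy.ListFeatureClasses():
    arcpy.CopyFeatures_management(fc, "newmexico.gdb/" + fc.replace(".shp", ""))

# Build a dictionary of county seats: {city name: population}
county_seats = {}
where = "FEATURE = 'County Seat'"          # assumed SQL query
with arcpy.da.SearchCursor("newmexico.gdb/cities",
                           ["NAME", "POP2000"], where) as cursor:
    for row in cursor:
        county_seats[row[0]] = row[1]      # NAME is index 0, POP2000 index 1

print(county_seats)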

Just after the point where I copied the shapefile data to the new geodatabase is where things started to get a bit hairy... While there were problems with my original code attempts using the SearchCursor method, the real issue lay with the fact that my geodatabase didn't initially populate! I figured this out after several attempts to run the code - once I was locked out of my dataset (and decided to delete the geodatabase in ArcCatalog to start over), I realized that I couldn't expect my code to search a cities feature class that never actually existed in my geodatabase in the first place! The solution was to use the arcpy.ClearEnvironment() function... after that I finally had a working script.

My success was short-lived - getting the dictionary to populate was also not so easy for me. The hint in the lab was that the code needed to iterate within a for loop, and an example of how to set up the iteration was even provided. It sounded simple (and looking back on it, it IS simple), but this single step took me hours to get through. To summarize my problems-and-solutions mini-drama:
  • The original SQL query used in my previous SearchCursor step had been deleted after that step... so I needed to re-write that bit of code.
  • The for loop needed its iteration cycle set up... first by defining my variables, then by plugging these variables into the suggested iteration code.
  • The index used to define the for loop variables needed to match the position of each variable within my SearchCursor field list; for example, if the city name was listed first within my SearchCursor, then it needed to be referenced as [0] within my for loop code.
Such small details caused a world of grief... and learning! Hopefully I won't be forgetting these lessons anytime soon.


Friday, June 26, 2015

Module 6 - Geoprocessing with Python

This week's lab focused on running various geoprocessing tools from within PythonWin - not within ArcMap itself. Setting geoprocessing environments and checking out licenses were also covered within the lecture materials.

Process results from Module 6 script.

The above screenshot shows the results of this week's lab assignment. The goal was to write code that runs three different geoprocessing tools (the Add XY tool, the Buffer tool, and the Dissolve tool) and prints the results messages after each tool runs. The bit at the end about Module 6 being a success is my own personal variation.

At first this lab seemed daunting, but the ArcGIS Help files were really the key to completing it. The help files contain useful scripting examples and notes, which made writing the code to run the geoprocessing tools very easy. The only real snags were directing the results to the correct file paths and making sure to use the results of previous processes as inputs to the next one - but with a little thought things worked out well in the end.
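
A condensed sketch of what such a script might look like - the file names, buffer distance, and workspace path are illustrative stand-ins, not the actual assignment data:

import arcpy

arcpy.env.workspace = r"C:\data\module6"   # hypothetical path
arcpy.env.overwriteOutput = True

# Add XY coordinates to the point layer, then report the tool messages
arcpy.AddXY_management("hospitals.shp")
print(arcpy.GetMessages())

# Buffer the points - the previous tool's output is the input here
arcpy.Buffer_analysis("hospitals.shp", "hospitals_buffer.shp", "1000 METERS")
print(arcpy.GetMessages())

# Dissolve the buffers into a single feature
arcpy.Dissolve_management("hospitals_buffer.shp", "hospitals_dissolve.shp")
print(arcpy.GetMessages())

print("Module 6 was a success!")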

Friday, June 19, 2015

Module 5 - Geoprocessing in ArcGIS

This week we used Model Builder to create a simple model, and then converted that model to Python script.

Model result - the removal of certain soils from within a basin.
The above screenshot shows the results of the model that was created for lab. Essentially the model clips a soils layer to the extent of a basin layer, selects a certain soil type from within the clipped area (specifically soils designated as 'Not prime farmland'), then erases the selected soil type from the clipped area. The result shown above is the clipped soil extent without the 'Not prime farmland' areas.

One could do all of these steps individually... or combine them into a series of tasks using Model Builder. Converting the model steps to Python was super easy... and I found that I understood the Python coding process a bit better as a result. With a few simple fixes, the exported Python code could be run outside of ArcGIS - which will be super handy for the final project.
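
A rough sketch of the exported model as stand-alone Python - the file names and the soils attribute field are assumptions based on the description above:

import arcpy

arcpy.env.workspace = r"C:\data\module5"   # hypothetical path
arcpy.env.overwriteOutput = True

# Clip the soils layer to the basin extent
arcpy.Clip_analysis("soils.shp", "basin.shp", "soils_clip.shp")

# Select the 'Not prime farmland' soils from the clipped area
arcpy.Select_analysis("soils_clip.shp", "soils_select.shp",
                      "FARMLNDCL = 'Not prime farmland'")

# Erase the selected soils from the clipped extent
arcpy.Erase_analysis("soils_clip.shp", "soils_select.shp", "soils_final.shp")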

Saturday, June 13, 2015

Module 4 - Debugging & Error Handling

This week's lab focused on debugging scripts... on purpose, not just fixing those errors we accidentally create because we don't know any better! The error handling techniques learned in class were applied to scripts containing both syntax and exception errors.

Script 1 - final results of a corrected script


The first script, shown above, contained basic input errors - such as not using the exact syntax of a defined variable. I was able to spot most of the errors right away, but I used the PythonWin 'Step' debugging tool to highlight where the issues actually were.

Script 2 - final results of a corrected script



The second script, shown above, was a little trickier. This time I employed a mix of running the script with and without the Step debugging tool - eventually I was able to fix it. The mistakes were similar to those in Script 1... except this time there were more of them, so simply scanning the script wasn't very effective.

Script 3 - results show how exceptions can be caught!

The final script was not corrected, exactly. The purpose here was to use the "try-except" method of error handling... this allows the script to run while telling you in greater detail what the real issues are. I decided to use a "try-except" statement for what I perceived to be each block of code - a sign that I'm finally starting to read and understand scripts! My real issues with the final script were with improper syntax - nothing was going to happen until I had standardized my indentation. Once I figured that out I was able to produce the results that you see above.
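
A generic sketch of the try-except pattern - the geoprocessing call inside is purely illustrative:

import arcpy

try:
    # Any error raised in this block is caught instead of halting the script
    arcpy.Buffer_analysis("roads.shp", "roads_buffer.shp", "500 METERS")
except Exception as e:
    # Report what actually went wrong, then carry on
    print("Error: " + str(e))

print("The script continues past the error handler.")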

Friday, June 12, 2015

Discussion Post 1

Our first discussion in GIS Programming is about the uses of GIS in the real world. To facilitate our discussion, we were to find an article showing just about any interesting application of GIS - not necessarily limited to Python programming. My article focused on the use of least-cost path analysis to help determine the route of Hernando de Soto's 1540 entrada:
http://saa.publisher.ingentaconnect.com/content/saa/aa/2015/00000080/00000001/art00003

The researchers wanted the optimal path to take into account the size of the party traveling it. As pointed out in the article, optimal-path analyses use slope information to find the best route, so the default best route is almost always along flat ground. Within the area under consideration, that would mean the best route tends to follow canyon bottoms - which isn't appropriate when trying to determine where a large party of over 100 people, plus livestock and supplies, would have traveled. To fix this they 'degraded' the spatial resolution of a DEM (specifically a 90-meter-resolution DEM from the 2000 U.S. Space Shuttle Endeavour mission) - this did not affect the accuracy of the DEM but did eliminate the default canyon-bottom choice (p. 50). In the end their preferred route was a wide northerly path that was a bit longer distance-wise, but it matches accounts from the travelers themselves, as well as archaeological and linguistic place-name evidence.
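
For illustration, here's a rough arcpy sketch of the general approach (coarsen the DEM, build a slope-based cost surface, trace the least-cost path) - this is my own sketch, not the authors' actual code, and every file name is made up:

import arcpy
from arcpy.sa import *  # Spatial Analyst

arcpy.CheckOutExtension("Spatial")

# 'Degrade' the 90 m SRTM DEM to a coarser cell size
arcpy.Resample_management("srtm_dem.tif", "dem_coarse.tif", "270", "BILINEAR")

# Use slope as a simple cost surface (+1 avoids zero-cost cells)
cost = Slope("dem_coarse.tif", "DEGREE") + 1

# Accumulate cost from the origin, then trace the least-cost path
cost_dist = CostDistance("origin.shp", cost, "", "backlink.tif")
path = CostPath("destination.shp", cost_dist, "backlink.tif")
path.save("least_cost_path.tif")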

Overall I liked that the article used GIS as one tool (out of many) to answer an unresolved question, and that the traditional parameters of a cost-path analysis were changed to fit a specific project need. The article shows just one of the many ways GIS analysis can be used to answer spatially related research questions.

Reference:
Sampeck, K., Thayn, J., and Earnest, H. H., Jr.  (2015)  Geographic Information System Modeling of de Soto's Route from Joara to Chiaha: Archaeology and Anthropology of Southeastern Road Networks in the Sixteenth Century.  American Antiquity, 80(1): 46-66.