| BackgroundThis site is hosted by the Watersheds and Water Quality Program, located in the Agricultural and Biological Engineering Building at the University of Idaho. The primary goals of this system are to distribute stream monitoring data and weather collected from within the Paradise Creek watershed. The datasets are available for download in raw text format, or can be actively filtered and layered for viewing or custom downloads. The secondary goal of this system is to provide a scalable data management framework so that datasets from other watersheds can be added without the need for structural revisions. |
||
|
||
| Data Source DevelopmentThe sources of the data available here are derived from the raw data samples taken from automated field monitoring stations or manually collected samples. Raw data collected by staff and students of the UI Watersheds and Water Quality Program, are imported into a spreadsheet. The data are then analyzed for discrepancies, missing data points, and other issues and corrected to ensure that the results are consistent. Once the datasets have been 'cleaned', they are exported to TAB delimited text files. These files typically contain several columns of varying data types, such as stream flow, turbidity, temperature, PH, etc. All files of this type also contain a date-time stamp. This data management system uses these text files as sources for the primary database. The diagram below illustrates how these data sources are handled by this system: |
||
ASCII Source Files
At the heart of the system lies the collection of ASCII data files. These files are used to populate the database. Each file contains data from one data type for one (water) year. Limiting the content of each file to one data type allows for a modular data management design. The benefits of this include
|
||
|
||
| The m_date column is a standard date format (yyyy-mm-dd) suitable for storage into the MySQL database. The m_time column indicates the time of day the sample applies to. The ordinal column is the numeric day of the year, and the m_value column provides the value of the data type at the given sample date and time. There are cases where actual data values are not available (e.g. sensors were offline). In such cases, data values were estimated or interpolated and an optional m_code column would be present to indicated how or why this was done. View current ASCII file list The above list shows the current collection of ASCII source files. You will note that the filenames are coded to indicate information about the file in the following manner: Example: 'CON-20-NTU_WY01.txt' is in the form of |
||
| Data StorageThe MySQL database is defined as separate data tables for each data type. This method allows queries to be performed across tables so that comparisons of different data types can be made (for any given sample time and date). Fast inserts - Local testing has shown that the LOAD DATA INFILE method for populating the database is roughly 20 times faster that performing INSERT statements for the ~35,000 records contained in each source ASCII file. Modular - Data types and water years are updated/added as a unit. Scalable - New data types can be added 'on the fly'. Flexible - Queries for multiple data types allow display or downloads of data joined by sample dates and times. |
||



