вход по аккаунту


How to download and load CSO data into PostgreSQL

код для вставкиСкачать
Lab 8
Getting CSO data into PostgreSQL This lab describes how to load the CSO files into dublin_eds table. You should only use this lab on your own machine. Do not do this lab in the college labs. It is important that you know how CSO data is loaded into PostgreSQL
The CSO census for 2006 consists of over 70 themes. Each of these themes contains several components. We will use three steps to get CSO data into PostgreSQL
1) Download a particular topic for a given area. We will download the population for the Dublin Electoral Divisions.
2) Load the topic into a table specifically designed for that topic.
3) Move the data to the Electoral Division (ED) table that contains all topics and the geometry of each ED. You must have dublin_eds in your database, see Lab1 for details on how to load an SQL file itno postgreSQL.
4) We will also look at view the well know text formats of the OGC standard for POINT, LINE, POLYGON, etc.
Go to the CSO Census 2006 reports page for loading into PostgreSQL:
Follow the following screen shots:
Scroll to Dublin City
Select Dublin City and download as a CSV file.
Save the file as csopop.csv in C:\Program Files\PostgreSQL\8.4\bin
Edit csopop.csv in TextPad or Notepad and delete the header
At the end of the file make a new line and add a backslash and a full stop (\.)
These two characters act as a terminator in a CSV file.
Each generic theme has one or more components. It is the theme components that actually get downloaded. Part of Theme 1 is population broken down by males and females. The following SQL creates a table for just the population information for the Dublin EDs. It does not contain geometry. Selected columns from the table will later be loaded into the big ED table called dublin_eds. You must have dublin_eds in your database.
(geographic_area character varying(50),
total character varying(10),
persons integer,
males integer,
females integer);
In Windows-XP load the CSV file by typing the following into the SQL-Shell.
COPY csopop FROM
E'\C:\\Program Files\\PostgreSQL\\8.4\\bin\\csopop.csv' WITH CSV;
In Windows-7 or VISTA load the CSV file by typing the following into the SQL-Shell.
COPY csopop FROM
E'\C:\\Program Files (x86)\\PostgreSQL\\8.4\\bin\\csoedu.csv' WITH CSV;
You now have the population in PostgreSQL. Do a test query.
select geographic_area, sum(males+females) from csopop group by geographic_area;
To copy data from csopop to dublin_eds do the following.
update dublin_eds set male1_1 = ( select males from csopop where saps_label = geographic_area);
update dublin_eds set female1_1 = ( select females from csopop where saps_label = geographic_area);
Now check the population of each ED
select saps_label, (male1_1 + female1_1) from dublin_eds;
How many records in dublin_eds should be updated?
Can we check where updates did not occur?
select saps_label, male1_1 from dublin_eds where male1_1 is null;
select geographic_area from csopop where geographic_area like '%Kevin%';
select saps_label from dublin_eds where saps_label like '%Kevin%';
select saps_label,(female1_1+female1_1)from dublin_eds where saps_label like'%Kevin%';
The CSO census for 2006 consists of over 70 themes. Each of these themes (Table 1) contains several components (Table 2). The dublin_eds table has columns for 7 themes. The table dublin_eds is in student distrib. The table dublin_eds contains the following Themes:
Theme 1 Sex, age and marital status
Theme 1 - 1:Persons aged 15 years and over by age education ceasedTheme 8 Economic Status
Theme 8 - 1 : Economic StatusTheme 9 Social Class and Socio-Economic Group
902,Theme 9 - 2: population by socio-economic group of reference person.Theme 10 Education
Theme 10 - 4: Persons aged 15 or over by sex, principal economic status and highest level of education completedTheme 11 Commuting
1101,Theme 11 - 1: Persons aged 5 and over by means of travel to work, school or college1501,Theme 15 - 1:
Number of motor cars available to private households1502,Theme 15 - 3:
Number of private households with internet access by type of access Table 1: The 7 CSO-themes for this assignment
In more detail the data types for each component are:
Description & table name for themesColumns names and data typesPersons by age and sex.
Male and female totals only
your_ed_theme1_1male1_1 integer,
female1_1 integer,Economic status and education
Switch Table Dimensions and total of males plus females only.
at_work8_1 integer,
looking_for_first_regular_job8_1 integer,
unemployed8_1 integer,
student8_1 integer,
looking_after_home_family8_1 integer,
retired8_1 integer,
unable_to_work8_1 integer,
other8_1 integerSocial class
your_ed_theme9_2aemployers_and_managers9_2a integer,
higher_professional9_2a integer,
lower_professional9_2a integer,
non-manual9_2a integer,
manual_skilled9_2a integer,
semi-skilled9_2a integer,
unskilled9_2a integer,
own_account_workers9_2a integer,
farmers9_2a integer,
agricultural_workers9_2a integer,
others9_2a integer
Education Level: Switch Table Dimensions and total of males plus females only.
formal_education10_4 integer,
primary_education10_4 integer,
lower_secondary10_4 integer,
upper_secondary10_4 integer,
technical10_4 integer,
upper_secondary10_4 integer,
non_degree10_4 integer,
primary_degree10_4 integer,
professional_qualification10_4 integer,
both_degree_and_professional_qualification10_4 integer,
postgraduate_certificate_or_diploma10_4 integer,
doctorate10_4 integer,
not_stated10_4 integer,Commuting
your_ed_theme11_1foot11_1 integer,
bicycle11_1 integer,
bus11_1 integer,
train11_1 integer,
motorbike11_1 integer,
car11_1 integer,
car_passenger11_1 integer,
other11_1 integer,
not_stated11_1 integerCar Ownership
your_ed_theme15_1No_car15_1 integer,
one_car15_1 integer,
two_cars15_1 integer,
three_or_more_cars15_1 integerInternet Connection
your_ed_theme15_3broadband15_3 integer,
other_connections15_3 integer,
no_internet15_3 integer,
not_stated15_3 integerTable 2 : CSO-themes and components: table/column names & simplifications
Viewing well known text formats for OGC data types
Start PgAdmin and query window:
Type the queries below into the SQL query window:
select astext(the_geom) from roads;
select astext(the_geom) from county;
select astext(the_geom) from dublin_highway1;
select astext('LINESTRING(0 0, 10 10 )');
select astext(the_geom) from dublin_historical;
Без категории
Размер файла
241 Кб
Пожаловаться на содержимое документа