Now it's time to write your own query in two parts - the columns you want to select, and the conditions you want each returned result to satisfy. So for example, the following table show possible queries over the Guardian Datastore MPs' Expenses spreadsheet (key: rvWgEEGK9xuUQBR1EFcxHWA)
| select input | where input | Comments |
| * | | Display everything |
| B,C,I | | Display columns B, C and I |
| B,C,I | I=23083 | Display columns B, C and I for MPs claiming exactly 23083 in column 83 |
| count(I) | I=23083 | Count how many people claimed exactly 23083 in column I |
| B,C,I | I!=23083 order by I | display the people who did not claim exactly 23083 in column I and display them in increasing order of column I values |
| B,C,I | I!=23083 order by I desc | display the people who did not claim exactly 23083 in column I and display them in decreasing order of column I values |
| B,C,D,E | (C contains 'Joan' or C matches 'John') | Select by name (case sensitive); 'matches' must match exactly, 'contains' is a free text search |
| * | F<100000 | full details of everyone who claimed less than 100000 in column F |
| sum(I) | | Total claimed within column I |
| count(I) | | number of rows where there's a value in column I |
| count(I) | where I>=0 | make sure we only count 'valid' rows. |
| sum(I)/count(I) | | Calculate the average amount claimed in column I |
| D,sum(I) | I>=0 group by D | Find out how much has been claimed by each party named in column D |
| D,sum(I)/count(I) | I>=0 group by D | for the total claimed by each party (column D), how much on average does each member of that party claim |
Try out some visualisation queries here...