The Open University

Student and tutor module reviews

Data management and analysis

  • Points: 30
  • Code: TM351
  • Level: 3

Student reviews

I had been looking forward to doing this module from the first time I signed up with the OU.

Material: I thought the course material provided by the OU is adequate. It's all online, which I prefer.

Workload: This is a Level 3 module and I thought the workload was proportional to the level.

Tutor: My tutor was very responsive right up to the end! I was impressed.

My results: I performed well in continuous assessments. But it is the final result that I'm struggling to come to terms with (I've actually just seen the final score as I write this). I was really gunning for at least a Grade 2 pass. But I only managed a Grade 3! How? This is most disappointing! All that hard work. I feel like I wasted time.

Course starting: October 2018

Review posted: July 2019

I completed the first presentation of this course. There were many issues with the Virtual Machine provided on a USB drive prior to course commencement. Many students had problems with the software throughout the course and at the beginning the forums were awash with unresolved issues.
This is a course that could have been great, but was instead disjointed, error-ridden and extremely word heavy. A great deal of the module teaching content was spent on explaining various kinds of databases and showing how the data stored in them can be manipulated.
It expects you to be a Python whizz before you start any of it, as only one week is spent reviewing some basic Python before diving into the use of a complicated and confusing Python library.
There was a massive workload and huge swathes of material to read and digest. Additionally, students were directed to read a lengthy statistics book which was linked to via the module materials. I would recommend that, if you take this course, you study it alone; it would be best not to study another course at the same time.
In our presentation we had one week's work on the KNN/K-means analytics and the statistical proofs for them. However it transpired that much of the work was built almost entirely on the content from that week, along with assuming you were an expert on statistics (another aside of the course) and a master of one or more of the databases demonstrated to set up the data for an investigation.
We were told that it wasn't a statistics course, but the final assessment required demonstration of a high level of statistical knowledge and terms. The course content went to some lengths to stress the importance of documenting progress and keeping files for an investigation and that even if the investigation found no significant result, this was still important and valuable. The expectation was then the opposite - only the written report and finding a statistically significant, mathematically provable result mattered.
With any reasonably large data set with multiple variables, time was wasted attempting to "see" something to work with (and sensibly document/describe). There was also little discussion or practical guidance on how best to approach analysing a new data set, or on how to interpret the results of long-winded, frustratingly difficult and overly complex Pandas data manipulation.
The practical exercises were completed using Jupyter notebooks and many students had problems using the software because of system incompatibility.
I didn't find this course to be divisive and students mucked in together to try to help each other and resolve issues.

Course starting: January 2016

Review posted: August 2017

I really did enjoy this course, although it was quite divisive with a lot of people. TM351 covers quite a broad spectrum, starting with simple details like spreadsheets and working up through relational databases to modern NoSQL structures like MongoDB. There are also minor diversions into data cleansing and data protection, before finishing with data analysis techniques.

Because this course is marked by an EMA instead of an exam, you're not expected to remember everything and you're often given several different ways of solving a problem. The downside of this is that you are given a lot to read and digest, so you need to be disciplined with what you read and what you reference for later.

The course has a lot of practical work, with a substantial amount of the work done in Jupyter notebooks in Python, and each TMA involves work in these notebooks, which build on each other. I really enjoyed working with them, but you need to have met the prerequisites of the course or you may struggle.

The EMA itself is the most open-ended thing you will do at the OU until your final project. You're given a lot of data, and you're expected to investigate an area you want and write up your findings. It was something I really enjoyed, but a lot of students were unhappy with it.

I must have enjoyed the course as I'm basing my final project on it, but if you are thinking of doing this, do be prepared for the workload, and do be prepared to get your hands dirty with some difficult (at times) problems.

Course starting: January 2016

Review posted: February 2017

An interesting and good-fun module: recommended. I found coverage of the data analysis pipeline, alternative file structures (notably CSV and JSON), database systems, and data manipulation using the Python Pandas library, together with graphic display using Pandas and Matplotlib, of continuing interest and fairly easy to work with. Numerous examples in IPython notebooks provide a good indication of practical application at different stages of analysis. EMA datasets were supplied, but there was a free choice of the questions to be answered by analysis, allowing some creativity.

Required applications are provided by a 'headless' Linux (no GUI) Virtual Machine (VM) that performed just about flawlessly hosted by the current edition of VirtualBox (not the supplied version) running under my 64-bit Linux. Local-host access to the VM using Firefox gave no problems.

Forums were well moderated with most queries answered quickly. My three or four queries to my tutor got a by-return response. The iCMAs were also useful, but those lacking 'correct' answers even after the cut-off date less so. I never found all the preferred roles for Data Engineer, Data Manager, Data Analyst and All, for example, and still have no idea which one was counted wrong.

By chance the VM's version of Anaconda (NumPy, SciPy, Pandas, Matplotlib, etc.) was out of date at the start of the module. Two missing packages caused minor hiccups in my EMA plan of work, but hopefully later presentations will get updated software.

Functions as a data-type, substantial and nested list and dict comprehensions, vector and matrix operations from NumPy and thus Pandas all feature in required code but are not covered in prerequisite modules or Part 1 Notebooks. If you already have a firm background in these concepts no problem, otherwise be prepared for quite a lot of research outside the course materials. Currently McKinney, W. (2012) Python for Data Analysis… from the OU Library and other suggested sources look more essential than optional. A bit more top-down presentation of underlying principles would be preferable to searching for quick fixes on StackExchange I think. My tip is to get this sewn-up first and save a lot of time later.
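For readers unsure what "functions as a data-type" and nested comprehensions look like in practice, here is a minimal sketch. It is not taken from the module materials: every name and value in it is made up for illustration, and it sticks to the standard library, so the NumPy vector operations mentioned above are not shown.

```python
# Illustrative only: all names and data here are hypothetical.

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def identity(x):
    return x

# Functions are ordinary values: they can be stored in a dict and looked up by key.
converters = {"C": identity, "F": celsius_to_fahrenheit}

readings = {"london": [10, 12, 15], "leeds": [8, 9, 11]}

# Nested comprehension: an outer dict comprehension over units, an inner dict
# comprehension over cities, and a list comprehension over each city's readings.
converted = {
    unit: {city: [convert(t) for t in temps] for city, temps in readings.items()}
    for unit, convert in converters.items()
}
```

Here `converted["F"]["london"]` holds the London readings in Fahrenheit, produced by whichever function was stored under the key `"F"`.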

Assessing the statistical significance of data got quite limited coverage leaving a large gap in what might be required to progress this subject further. I used comparison with software-generated random data and a duck-typed probability distribution rather than any described technique - OK for the EMA apparently.
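One common way to compare a result against randomly generated data is a permutation test. The sketch below is an assumption about what such a comparison might look like, not the reviewer's actual method or anything taken from the module; the function name and parameters are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Estimate how often a random relabelling of the pooled data gives a
    difference in means at least as large as the one actually observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabelling of the observations
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

A small returned value suggests the observed difference would be unusual if the group labels were arbitrary; a value near 1 suggests it is what random relabelling typically produces.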

Similarly, using a k-Nearest Neighbours classifier (and/or k-means datapoint clustering) was a requirement for the EMA. Alternatives to the leave-one-out, O(n^4) method for finding a good k for k-NN were not covered making application to anything other than quite small datasets a problem. Also, more Matplotlib examples would have been helpful. Matplotlib is considerably more flexible than the Pandas plotting covered in some detail.
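As a rough illustration of leave-one-out selection of k for a k-NN classifier, here is a naive pure-Python sketch. It is not the module's code: the function names, the `(point, label)` data layout and the use of Euclidean distance are all assumptions made for this example.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`.
    `train` is a list of (point, label) pairs; distance is Euclidean."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out: classify each point using all the others."""
    correct = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out the i-th point
        if knn_predict(rest, point, k) == label:
            correct += 1
    return correct / len(data)

def best_k(data, candidates):
    """Return the candidate k with the highest leave-one-out accuracy
    (the first such k if several tie)."""
    return max(candidates, key=lambda k: loo_accuracy(data, k))
```

Each call to `loo_accuracy` computes about n² distances for n points, and that cost is repeated for every candidate k, which is why this brute-force version only suits small datasets.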

Tutorial support was minimal with only three online tutorials: 'How to do TMA01, How to do TMA02, How to do the EMA.' There was also a face-to-face I did not attend, the 130-mile round trip not being considered a worthwhile proposition. Forum postings revealed that some tutors were covering more technical insight than others, but for some unspecified and unfathomable reason recordings from other tutor-groups were not generally accessible. The result: a severe culture shock after other modules with 19 or more tutorials presented by a panel of tutors, accessible to all participants in up to two Regions with continuing access post module completion.

So, how hard is this module?
Material that is consistently interesting, guru and claptrap free and applicable to something one might actually like doing makes it easier than some. The amount of reading equates to other Level 3 computing modules I have done - somewhat less if 'optional' but possibly essential external sources are excluded. The modal mark was Grade 2 which is typical but distinctions were few. Perhaps report writing is not everyone's thing but EMAs are good for those who, like me, currently lack exam stamina. An explicit marking scheme related to examinable criteria gave a reasonable idea as to how things went.

I managed the top-marks band for all criteria other than 'Technical reflection on the data investigation' - not too hard then. This markdown probably resulted from the lack of an explicit review of measures taken to overcome missing data. My tip is to make sure with a summary in the report's Reflection section and not rely on stuff included elsewhere being sufficient explanation for markers.

Martin Thomas Humby

Course starting: January 2016

Review posted: December 2016

Please note

Each of the views expressed above is an individual's very particular response, largely unedited, and should be viewed with that in mind. Since modules are subject to regular updating, some of the issues identified may have already been addressed. In some instances the faculty may have provided a response to a comment. If you have a query about a particular module, please contact your Regional Centre.
