Neuroscience International

Big Data and Parkinson’s Disease - from Understanding of Disease to Personalised Medicine

Yen F. Tai and Nicola Pavese

DOI: 10.3844/amjnsp.2016.1.3

Volume 7, Issue 1

Pages 1-3


The advancement of information technology in recent years has generated a huge number of datasets in all areas of scientific research. This has led to new opportunities to harness ‘Big Data’ to improve our understanding of diseases - from investigating disease mechanisms to monitoring treatment responses.

The Big Data approach is a data-driven, and often hypothesis-free, way of studying a disease. It generally involves multiple research centres pooling resources and setting up consortia to generate large-scale datasets. Standardised protocols are used to acquire data that traverse molecular, genetic, clinical and imaging domains, hence allowing integration of clinical and biological data for large-scale cohort studies. Such a standardised approach also enables pooled data from different centres to be studied and compared, which greatly increases statistical power. The datasets are usually made accessible to the wider scientific community, further enhancing scientific collaboration and transparency. It also reduces redundant or duplicative efforts and is potentially a more cost-effective way to conduct large-scale scientific studies (Editorial, 2014).

Such an approach has been particularly productive in the field of genomics. Genome-Wide Association Studies (GWAS) involving thousands of patients have successfully identified genetic susceptibility loci for Parkinson’s Disease (PD) (Simon-Sanchez et al., 2009). In its simplest form, GWAS analysis, as representative of the basic analytical approach to processing Big Data, aims to identify genetic variants such as single-nucleotide polymorphisms that distinguish a population with a particular trait or disease from a control population. For diseases with more complex genotypic and phenotypic components such as PD, network analysis has increasingly been used to identify a group or network of interacting genes which may be implicated in disease pathogenesis (Leiserson et al., 2013). Such network analysis has also been applied in imaging studies to investigate networks of anatomically dispersed brain regions and their roles in diseases. A number of statistical approaches including linear regression, logistic regression, principal component analysis and latent class analysis have been employed to analyse these large datasets (Wang et al., 2014), but it is beyond the scope of this editorial to explain them in detail.
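The simplest form of GWAS analysis described above can be sketched as a per-variant comparison of allele counts between cases and controls. The following is a minimal illustration only, not the pipeline of any cited study: the variant names and counts are hypothetical, and the Pearson chi-square test with a Bonferroni-corrected threshold is one common choice assumed here for simplicity.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    n = sum(row_totals)
    # A zero margin means the table carries no information about association.
    if 0 in row_totals or 0 in col_totals:
        return 0.0, 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def scan_variants(counts, alpha=0.05):
    """Return variants whose p-value passes a Bonferroni-corrected threshold."""
    threshold = alpha / len(counts)
    hits = []
    for variant_id, allele_counts in counts.items():
        chi2, p_value = allelic_chi_square(*allele_counts)
        if p_value < threshold:
            hits.append((variant_id, chi2, p_value))
    return hits


# Hypothetical counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
example_counts = {
    "variant_A": (300, 700, 200, 800),  # allele enriched in cases
    "variant_B": (250, 750, 250, 750),  # identical frequencies in both groups
}
hits = scan_variants(example_counts)
```

With these made-up counts only `variant_A` passes the corrected threshold. A real study tests a very large number of variants, so the correction for multiple testing becomes far more stringent than in this two-variant example.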

The Parkinson’s Progression Markers Initiative (PPMI), sponsored by the Michael J. Fox Foundation for Parkinson’s Research, is a multi-centre collaborative observational study that collects clinical, behavioural, imaging and genetic data and biological samples from cohorts of significant interest, including de novo PD patients and participants at high risk of developing PD. It provides a standardised and longitudinal PD database and biorepository which are open to the wider scientific community. It has the stated aim of finding one or more biological markers for the disease as the critical next step towards developing new treatments (PPMI, 2014).

Several high-profile collaborative brain imaging studies, including the Human Brain Project, Brain Activity Map and BRAIN Initiative (Brain Research through Advancing Innovative Neurotechnologies Initiative), have been set up to better understand brain functions, either by creating a large-scale computer simulation of the brain or by establishing a functional connectome of the brain, with the hope that this will eventually lead to a cure for neurodegenerative disorders such as Alzheimer’s and Parkinson’s disease (Kandel et al., 2013).

These large-scale studies can be costly and there are disagreements within the scientific community on how best to run them and whether they will achieve the stated aims (OMECCHBP, 2014). There are also statistical and computing challenges when processing such large volumes of data. Many statistical models do not account for possible interdependence of the multiple parameters being sampled and this could lead to a reduction in the degrees of freedom and violation of some statistical principles. The statistical models themselves may also introduce a degree of bias and false discovery (Wang et al., 2014).
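The risk of false discovery when many parameters are screened can be illustrated with a small simulation. This is a sketch under stated assumptions rather than an analysis from any cited work: every feature and the outcome are independent random noise, and the sample size, feature count and seed are arbitrary illustrative values.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def null_correlations(n_samples=20, n_features=1000, seed=0):
    """Absolute correlations between pure-noise features and a pure-noise outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    correlations = []
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        correlations.append(abs(pearson(feature, outcome)))
    return correlations


correlations = null_correlations()
best_of_10 = max(correlations[:10])   # strongest "signal" among 10 features
best_of_all = max(correlations)       # strongest "signal" among all features
print(f"best of 10: {best_of_10:.2f}, best of {len(correlations)}: {best_of_all:.2f}")
```

None of these features is truly related to the outcome, yet the strongest apparent correlation can only grow as more features are screened. This is why signals from large, high-dimensional datasets need correction for multiple comparisons and independent replication.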

Critics have also argued that the Big Data approach can identify correlations between different disease parameters but it does not necessarily establish true causal associations and they struggle to see how this will lead to finding a cure for the diseases in question. Proponents of the Big Data approach counter-argue that such ‘signals’ are crucial in inspiring more hypothesis-driven studies to further evaluate the correlations and improve our understanding of diseases (Husain, 2014).

It is important to realise the benefits and limitations of the Big Data approach in studying a disease. It allows a large-scale, unbiased study of data obtained from molecular to individual levels. It does not supplant independent research conducted by individual research groups, which is crucial in generating ideas and hypotheses to complement the Big Data approach. The correlations or ‘signals’ obtained from the Big Data approach require further targeted studies to elucidate the underlying molecular mechanisms.

Apart from its role in helping us to understand diseases at the population level, the Big Data approach can also be applied, albeit on a smaller scale, on an individual basis to monitor disease fluctuations or treatment responses, especially in a disease with marked between- and within-individual variability like PD. As the disease progresses, most PD patients will develop motor complications such as wearing-OFF, ‘ON-OFF’ fluctuations, dyskinesias and gait freezing. The emergence of these symptoms reflects fluctuations in synaptic dopamine and other neurochemical levels in the brain. PD treatments, especially dopamine replacement therapy, will need to be titrated on an individual basis and, ideally, tailored to specific symptoms at a particular time.

The traditional methods of monitoring fluctuating PD symptoms based on patients’ or carers’ history, or PD diaries, are laborious and can be misleading as they rely on patients or their carers to accurately identify various ‘OFF’ or ‘ON’ symptoms, e.g., tremor versus dyskinesias. Clinic reviews provide only a snapshot of the patients’ symptoms and signs in a rather artificial, and potentially stressful, setting. There can also be discordance between treatment responses rated by patients and their physicians (Davidson et al., 2012).

Mobile or wearable devices worn on patients’ limbs or body are being developed to detect ‘ON’/‘OFF’ limb movements, balance deficits and gait disorders in PD patients using specific algorithms. They aim to allow the treating neurologists to monitor an individual’s PD symptoms continuously, remotely and objectively in real-life situations (Maetzler et al., 2013). Recently, the Michael J. Fox Foundation announced a collaboration with Intel to use wearable devices to monitor symptoms of PD patients. The devices record more than 300 observations per second from each patient and a Big Data platform has been developed by Intel to analyse the volume of data that will be generated (MJFFPR, 2014). Similarly, Parkinson’s UK has also teamed up with Global Kinetics Corporation to provide a wearable device, the Parkinson’s KinetiGraph, in a 12-month pilot project (EPDA, 2014).
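To give a sense of the data volume implied by the sampling rate quoted above, the arithmetic can be written out as a short calculation. It treats 300 observations per second as a lower bound and assumes, purely for illustration, that the device is worn continuously.

```python
SAMPLES_PER_SECOND = 300          # lower bound quoted in the text ("more than 300")
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds


def observations(days, rate_hz=SAMPLES_PER_SECOND):
    """Lower-bound observation count for one patient, assuming continuous wear."""
    return rate_hz * SECONDS_PER_DAY * days


per_day = observations(1)
per_year = observations(365)
print(f"{per_day:,} observations per patient per day")    # 25,920,000
print(f"{per_year:,} observations per patient per year")  # 9,460,800,000
```

Even under these simplified assumptions, a single patient produces tens of millions of observations per day, which explains the need for a dedicated analysis platform.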

This type of data with small sample size (individual patient) and high dimensionality (multiple measurements or parameters) is more susceptible to noise accumulation and spurious correlations (Fan et al., 2014). One way to get round this problem is by pre-processing the raw data to extract a more manageable secondary dataset of interest (Wang et al., 2014). While the statistical methods and algorithms involved might seem complex to most clinicians, the outcome generated is generally user-friendly (e.g., indicating dyskinesias versus bradykinesia) so most clinicians should not require extensive training or IT knowledge to avail themselves of the devices. The more commonly used wearable devices, which are applied on the patients’ forearm or wrist, are good at detecting motor fluctuations involving the monitored limb but are less sensitive at detecting gait disturbances or postural instability. To detect these abnormalities, one would often need to apply monitoring devices on patients’ trunk or leg and they may be perceived as more intrusive by patients.
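The pre-processing step mentioned above can be sketched as reducing a raw sensor stream to a handful of summary features per time window. This is a minimal illustration, not the algorithm of any device named in this editorial: the function name, the chosen features (mean, standard deviation and a crossing-based frequency estimate), the window length and the synthetic 5 Hz test signal are all assumptions.

```python
import math


def window_features(signal, sample_rate_hz, window_seconds):
    """Summarise a 1-D signal as per-window mean, spread and rough frequency."""
    size = int(sample_rate_hz * window_seconds)
    features = []
    # Step through non-overlapping windows; a trailing partial window is dropped.
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        mean = sum(window) / size
        centred = [value - mean for value in window]
        std = math.sqrt(sum(v * v for v in centred) / size)
        # Count sign changes around the window mean; two crossings per cycle.
        crossings = sum(
            1 for a, b in zip(centred, centred[1:]) if (a < 0) != (b < 0)
        )
        features.append({
            "mean": mean,
            "std": std,
            "approx_freq_hz": crossings / (2 * window_seconds),
        })
    return features


# Synthetic stand-in for a sensor trace: a 5 Hz oscillation sampled at 100 Hz.
sample_rate = 100
raw = [math.sin(2 * math.pi * 5 * (k / sample_rate) + 0.3)
       for k in range(4 * sample_rate)]
summary = window_features(raw, sample_rate, window_seconds=2)
```

Here 400 raw samples are reduced to two rows of three features each, with the frequency estimate landing at roughly 5 Hz. A secondary dataset of this kind is far smaller than the raw stream, although which features are clinically meaningful would need to be established in validation studies.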

Further validation studies are required to verify the accuracy of these devices in detecting or interpreting various clinical parameters before they can be used in routine clinical practice. We also need more evidence to show that such interventions can lead to better patient outcomes.


The Big Data approach has the potential to revolutionise medical research by improving standardisation and collaboration across different centres, enhancing statistical power and efficiency of medical studies. It also needs to be complemented by more specific hypothesis-driven studies. On an individual level, such an approach can help to better monitor symptoms and lead to personalised treatments for PD patients.


Davidson, M.B., D.J. McGhee and C.E. Counsell, 2012. Comparison of patient rated treatment response with measured improvement in Parkinson’s disease. J. Neurol. Neurosurg. Psychiatry, 83: 1001-1005. DOI: 10.1136/jnnp-2012-302741

Editorial, 2014. Consorting with big science. Nat. Neurosci., 17: 1289-1289. DOI: 10.1038/nn.3830

EPDA, 2014. Parkinson's UK and Global Kinetics Corporation collaborate to provide promising new mhealth technology throughout UK. European Parkinson's Disease Association.

Fan, J., F. Han and H. Liu, 2014. Challenges of big data analysis. Nat. Sci. Rev., 1: 293-314. DOI: 10.1093/nsr/nwt032

Husain, M., 2014. Big data: Could it ever cure Alzheimer's disease? Brain, 137: 2623-2634. DOI: 10.1093/brain/awu245

Kandel, E.R., H. Markram, P.M. Matthews, R. Yuste and C. Koch, 2013. Neuroscience thinks big (and collaboratively). Nat. Rev. Neurosci., 14: 659-664. DOI: 10.1038/nrn3578

Leiserson, M.D., J.V. Eldridge, S. Ramachandran and B.J. Raphael, 2013. Network analysis of GWAS data. Curr. Opin. Genet. Dev., 23: 602-610. DOI: 10.1016/j.gde.2013.09.003

Maetzler, W., J. Domingos, K. Srulijes, J.J. Ferreira and B.R. Bloem, 2013. Quantitative wearable sensors for objective assessment of Parkinson's disease. Mov. Disord., 28: 1628-1637. DOI: 10.1002/mds.25628

MJFFPR, 2014. The Michael J. Fox Foundation and Intel join forces to improve Parkinson's disease monitoring and treatment through advanced technologies. The Michael J. Fox Foundation for Parkinson's Research.

OMECCHBP, 2014. Open Message to the European Commission concerning the Human Brain Project.

PPMI, 2014. Landmark study to find biomarkers. Parkinson's Progression Markers Initiative.

Simon-Sanchez, J., C. Schulte, J.M. Bras, M. Sharma and J.R. Gibbs et al., 2009. Genome-wide association study reveals genetic risk underlying Parkinson's disease. Nat. Genet., 41: 1308-1312. DOI: 10.1038/ng.487

Wang, W. and E. Krishnan, 2014. Big data and clinicians: A review on the state of the science. JMIR Med. Inform. DOI: 10.2196/medinform.2913


© 2016 Yen F. Tai and Nicola Pavese. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.