A Citizen Led Research Approach to Fighting Air Pollution


All of us remember the old saying, “United we stand, divided we fall”, from Aesop's fables. If we just tap our memories of the stories of “The Four Oxen and the Lion” and “The Bundle of Sticks”, we immediately relate to the need to be united when fighting for a cause.

In this age of the internet and social media, with digital content ubiquitous, our quotient of being united has considerably diminished. That holds true for societies, for communities and even within families. There are plenty of likes on social media posts, memes and photographs, but not enough conversations. As a result, the ability to forge a concerted response to the living situations around us has considerably declined.

This has weakened our ability to forge a concerted response to issues relating to pollution, environmental degradation and the realities of climate change. Social media helps, but in trying to grapple with multiple platforms on limited bandwidth, the ability to engage becomes severely restricted. This makes it easier for governments to dismiss the few lone voices that express dissent and disagreement on matters vital to sustainable living. Air pollution is one such issue: it is very much our problem regardless of gender, age, community, religion, income strata or any other categorization that functions as a vote bank.

The India Air Pollution Survey is a research-backed initiative that Citizen Ecofinalytics has taken up to help citizens find their voices, express them and become a united community for change. This is a totally citizen-backed movement, which means that you not only express your concerns but also involve the people around you. In doing so, each of us becomes a citizen leader united for a cause.

Here is the link to the survey.

We hope that this is a beginning to being “Citizens United For a Cause”.


7 reasons why Nonprofits need to invest in research

Ask any nonprofit organization why they need research, and pat will likely come the reply: well, to get funding, of course. Some replies may verge on sarcasm about donors holding the key to the organization's survival, and some will hint at the possibility of losing employment. The research-funding relationship can be written very simply in equation form as “Evidence = Funding”.

The fact of the matter is that nonprofits are not sellers of products or services; they are development or change agents. That agency needs to be measured and evaluated against some performance metrics so as to assure their patrons, governments and society at large that their contribution to society is indeed authentic, impactful and sustainable.

The direct relationship of research or evidence to funding should therefore not be the only reason why nonprofits need to invest in research. In fact, research needs to be an integral part of a nonprofit's organizational structure. Given below are 7 reasons why nonprofits need to invest in research.

  1. It helps to clarify the overall vision and mission of the organization. Each organization is different, and even though organizations may work in common broad thematic areas, how they work individually can be very different. To take an example, women's empowerment can be a broad overarching goal for an organization, but it has to be fine-tuned so that the organization can spell out how it is going to reach out and create impact for potential beneficiaries.
  2. Specific interventions for particular projects and their stakeholders in identified geographical areas need to be adequately designed to account for what will work and what will not. This requires a theory of change for each particular project or program. A comprehensive literature review helps to clarify and identify the assumptions used for formulating the theory of change and the likely trajectory of change for beneficiaries.
  3. To be able to measure change or impact, it is required to formulate appropriate indicators to keep track of the inputs, processes, outputs and outcomes. Simply saying that the project will have an impact on the knowledge, attitude and practice (KAP) of beneficiaries, is not enough because one also has to formulate a construct that measures the KAP in that context. This requires intensive research and validation for each concept and construct along with devising appropriate measurement scales.
  4. There are some organizations that work directly with stakeholders, but there are others who rely on a host of partners to reach the ultimate beneficiary. Each of these field organizations is bound to have a different organizational structure and operating philosophy. Hence the need to ensure some alignment with the goals and objectives of the program through monitoring of their functional and operating processes and outputs.
  5. A fair number of nonprofits are involved in advocacy campaigns related to various citizen or community causes. For these organizations too, research is vital to give insights about the community they work with. This can relate to their social, cultural and demographic structure, their economic situation and their concerns about the phenomena that affect their lives in positive or negative ways.
  6. Each nonprofit requires active engagement with its constituents, including the need to magnify its presence. Social media is therefore an integral and important part of its communication strategy, especially if the organization is involved in advocacy campaigns related to human rights, the environment, sustainability or women's causes, which by their nature are mass based. This also holds true for disaster relief organizations. There is thus a pressing need to be able to track the reach and impact of social media and to do a commensurate sentiment analysis of the same. Research related to communication strategies is therefore extremely important. In fact, nonprofits that use behavioral communication, more aptly called social and behavior change communication, invest enormously in researching how communication can lead to behavior change.
  7. While most research is devoted to program design, delivery and evaluation, it is equally important to research the donor community and the level of financial support it provides. This is especially true when nonprofits rely on multiple donors and not just a few institutional donors. Donors can be categorized in multiple ways, among them the form of contribution, the quantum, the geographical segregation, and the cause supported.
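Reason 3 above turns on validating measurement scales. As a minimal illustration, here is one standard internal-consistency check, Cronbach's alpha, computed for a hypothetical three-item Likert sub-scale (the sub-scale and the data are invented purely for this sketch):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for k item-score columns answered by the same
    respondents. Values near 1 suggest the items measure one construct."""
    k = len(items)
    item_var_sum = sum(pvariance(col) for col in items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

# Hypothetical 1-5 Likert responses to a 3-item "practice" sub-scale
practice_items = [
    [4, 5, 3, 4, 2, 5, 4, 3],
    [4, 4, 3, 5, 2, 5, 4, 2],
    [5, 4, 2, 4, 3, 5, 5, 3],
]
alpha = cronbach_alpha(practice_items)  # roughly 0.90 for this toy data
```

A conventional rule of thumb treats alpha above about 0.7 as acceptable for a scale, though the threshold is context dependent and alpha is only one of several validation checks.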

All organizations need research, and nonprofits can be no different. Unfortunately, donors and philanthropists focus the majority of their budgets on programmatic interventions, leaving only a very small portion for research. This is often so meager that it scarcely allows for robust monitoring and evaluation systems, and in most cases program expenditure will inevitably cannibalize the research budget. Funding tied to evidence is no doubt extremely important, but nonprofits themselves need research to understand their own work and its impact, which can include both intended and unintended outcomes. Constant listening on the ground is critical to their growth strategy and to alignment with the organization's vision and mission. Hence, all nonprofits must keep a dedicated budget for research, regardless of their type and size.


Research and Statistical Consulting

It has now been five years of working in the area of statistical and research consulting, and there are valuable lessons that have been learnt from the numerous projects executed and the interactions with clients.

My expertise has been honed more as a quantitative researcher, but over the years one finds that the qualitative skills of listening on the ground are as necessary as the carefully constructed answer options given as yes/no answers or on Likert scales.

More than anything else, statistical and research consulting is all about understanding the client's requirements and limitations. Building that understanding requires multiple conversations, numerous rounds of back and forth to arrive at alternative scenarios, and answering tons of questions about why this is necessary and why that is not. The client is often uncertain, has limited bandwidth, is cash strapped and has to satisfy and pacify a number of internal and external constituencies.

In order to be able to satisfy the client, a research consultant must have the following qualities:

  1. Would like to solve real problems and to empower others to solve problems as well. This often requires the researcher to don the hat of capacity builder too
  2. Be an active listener and ask the required but direct questions in a way that enables the client to think deeply
  3. Be able to put forward the problem in clear and simple terms. This may often require multiple rounds of presentations
  4. Must have a broad knowledge and clear understanding of statistical and scientific methods
  5. Be willing to read and research the existing literature and approaches that relate to the client's requirements and problem area
  6. Be willing to apply existing statistical approaches and design to different environments
  7. Be able to write and communicate effectively the essence of the findings in a manner that the client is able to understand
  8. Ensure that estimates are statistically robust and that recommendations have taken a view of literature on that subject.
  9. Be prepared to give solutions that are actionable for the client while solving the client's problems in as much entirety as possible. This is something that needs an emphasis with a double underline.
  10. Avoid recommending research shortcuts to the client that compromise the quality of data or findings and will ultimately make the client look bad. Make sure that due process of research is followed, with documentation of references and discussions
  11. If possible, get some experience in the actual collection of data; if you receive only the data, ask who collected it, whether training was done, what steps were taken to avoid investigator bias, what sampling method was followed, and so on
  12. Always take a bit of extra time to check and double-check your results and procedures. Sometimes mistakes happen when you have been doing this over and over again, and in the hurry to meet the client brief you forget to double-check that tiny coding result or analysis step. Remember, even if it does not make a material difference to the result, still try and ensure that research standards are met.
  13. Be able to communicate effectively in writing as well as orally the final results. Often the client may be looking at the solution in a different way and you have to explain why your approach may be more reasonable and accurate. At the same time be flexible that the client has a point because at the end of the day, he is the one who is probably living and breathing the problem
  14. Be able to make a good estimate of how much effort will be required to solve the problem without actually having to solve the problem itself. That is sometimes the most difficult aspect, because you have to bill the client. The client may not realize the intricacies of what is involved: the need to clean data, check its accuracy, run background checks, see what methodology works, perform statistical testing, check the robustness of estimates, and write down results in a way that is accurate and understandable. The client wants results fast, at rock-bottom prices, and thinks that all you have to do is wave a magic wand to get it done in a jiffy. That is the subject of persuasion, conversation and negotiation.

Research consulting is like any other consulting, but sometimes more arduous, because it is evidence backed and requires a fair degree of programming, testing and alternative ways of looking at, structuring and solving a problem. But the road map defined above may well help you to be a good research companion to your clients.


Data Analytics and Precision Medicine

Precision Medicine: Data analytics has ventured into another vital area, that of using data to deliver personalized health care to millions of citizens. Personalized medicine has given way to “precision medicine”, a term used to describe a transformative model of health care that involves selecting diagnostic tests with the potential to identify changes in each individual patient's cells, and customizing treatment to individual requirements. Ultimately, the goal of precision medicine is to improve patient outcomes. That said, while there has been reasonable success in the application of big data, certain major hurdles to its universal adoption remain, stemming from the economic, technical, regulatory and human aspects involved. The NAS report defines precision medicine as the use of genomic, epigenomic, exposure and other data to define individual patterns of disease, potentially leading to better individual treatment. Precision medicine is much more than personalized medicine and is meant to convey a more accurate image of a diagnosis that is person-centered and multifaceted.

Data Analytics and Precision Medicine: Data analytics, or big data, in precision medicine essentially refers to the use of data science techniques to capture and analyze huge and complex datasets in order to positively impact patient care outcomes and optimize business processes. While the term big data may seem to reference the volume of data, that is not necessarily the case. Big data may also refer to the extent of technology that an organization requires to handle large amounts of data, as well as the facilities needed to store it. The healthcare industry produces large amounts of clinical, financial, administrative and genomic data and needs big data techniques to manage it.

Healthcare Data: Health care draws on a variety of data: Web and social media data (interaction data from Facebook, Twitter, LinkedIn, blogs, health plan websites, and smartphone apps); machine-to-machine data (information from sensors, meters, and other devices); transaction data (healthcare claims and billing records in both semi-structured and unstructured formats); biometric data (fingerprints, genetics, handwriting, retinal scans, X-rays and other medical images); human-generated data (Electronic Medical Records (EMRs), physicians' notes, email, and paper documents); and pharmaceutical R&D data (related to a drug's mechanism of action, target behavior in the human body and side effects).

The Electronic Health Record (EHR) is a longitudinal, systematic collection of electronic health information for a patient, generated by one or more interactions in any care setting. For these records to be accessible, it is essential that they be shareable across different health care settings. An EHR typically includes information such as patient demographics, medical history, medications and allergies, immunization status, laboratory test results, radiology images, vital signs, personal statistics like age and weight, progress notes and problem details, and billing information. The Electronic Medical Record (EMR) is a part of the EHR and refers to the digitized version of the paper chart in clinician offices, clinics, and hospitals. The EMR contains notes and information collected by and for the clinicians in that specific office, clinic, or hospital setting and is mostly used by providers for diagnosis and treatment. The term Personal Health Record (PHR) refers to EHRs that are designed to be set up, accessed, managed and controlled by patients in a private, secure and confidential environment. They usually contain health information generated by clinicians, home-based monitoring devices, and patients themselves.

Big data has been used for a considerable time in the health care industry. Current applications have involved collecting and aggregating the vast amounts of patient data produced from a variety of sources in order to analyze hospital systems, the effectiveness of patient care, the return on investment of business processes and general business intelligence. The use of data analytics will, however, go a step further in using the data collected to improve patient outcomes: through advanced clinical analytics to enhance proactive care, enhancement of clinical decision support using analysis of current knowledge databases, improvement of clinical trial design with the use of statistical tools and algorithms and, finally, building better models of personalized medicine through the analysis of large data sets.

Applications of Data Analytics in Precision Medicine: It is suggested that data analytics can be used in six very practical cases. They are high-cost patients, readmissions, triage, decompensation (when a patient’s condition worsens), adverse events, and treatment optimization for diseases affecting multiple organ systems (such as autoimmune diseases, including lupus).

  1. High-Cost Patients: As per a study done in the US, approximately 5% of patients account for 50% of all US health spending. Identifying high-cost patients is therefore paramount. Identifying high-risk or high-cost patients may require including attributes such as behavioral health problems or socioeconomic factors, such as poverty, minority status, and marital and living status, in the algorithms developed. Algorithms are most effective, and perform best, when they are derived from and then used in similar populations. A very critical aspect of identifying high-cost patients is the use of behavioral data, since a large portion of patients at high risk for hospital admission have some sort of behavioral health issue, depression being especially frequent. For instance, the standard approach may be to give all patients who are discharged from the hospital a follow-up appointment in two weeks. But it might make more sense to ensure that the highest-risk patients are seen within two days, while patients with very low risk might require follow-up care only as needed. Algorithms can help reallocate resources more effectively at both the high-risk and low-risk ends of the spectrum.
  2. Readmissions: As many as one-third of readmissions are regarded as preventable and therefore present a significant opportunity for improving care delivery. Health care organizations can use a predictive algorithm to identify who is likely to be readmitted to the hospital. The important differentiation in results would consist of tailoring the intervention to the individual patient, ensuring that patients actually get the precise interventions intended for them, monitoring specific patients after discharge to find out if they are having problems before they decompensate, and ensuring a low ratio of patients flagged for an intervention to patients who experience a readmission (that is, a low false-positive rate). It may also soon make sense to ask patients with a smartphone to allow health care organizations to access data from their phones that will help identify patients who are not managing a chronic condition well, or that will monitor people recently discharged from the hospital, since it appears that patients who are not making calls or sending e-mail with their usual frequency may be depressed or suffering from other issues. Patients may also be asked to wear some type of device that monitors physiological parameters, such as heart rate or rhythm. These data will be most effective in informing health care decisions if they are processed with analytics.
  3. Triage: Estimating the risk of complications when a patient first presents to a hospital can be useful for a number of reasons, such as managing staffing and bed resources, anticipating the need for a transfer to the appropriate unit, and informing the overall strategy for managing the patient. In newborns and many other populations, using modern big-data techniques that combine routinely collected physiological measurements makes much more accurate assessments possible with a minimal burden of training and implementation, for example, using maternal data to assign a preliminary probability of early-onset sepsis. Similarly, for high-risk adult patients, clinicians in the emergency department may be provided with two composite scores that have been calibrated using millions of patient records and that are applicable to all hospitalized patients, not just those in intensive care. The first of these scores summarizes a patient's global comorbidity burden during the preceding twelve months; the second captures a patient's physiological instability in the preceding seventy-two hours. These two scores, available in real time, are combined with vital signs, trends in vital signs, and other information, such as how long a patient has been in the hospital. If the information collectively indicates that a patient has a ≥8 percent risk of deteriorating in the next twelve hours, an alert is sent to the responsible providers.
  4. Decompensation: Often before decompensation—the worsening of a patient’s condition—there is a period in which physiological data can be used to determine whether the patient is at risk for decompensating. Much of the initial rationale for intensive care units (ICUs) was to allow patients who were critically ill to be closely monitored. A host of technologies are now available that can be used to monitor patients who are in general care units, in nursing homes, or even at home but at risk of some sort of decompensation. Monitors are becoming available in which multiple data streams can be compared simultaneously, and analytics can be used in the background to determine whether or not the signal is valid. One example of these new monitors is a device that sits under the mattress and that collects data about the patient’s respiratory rate and pulse and whether or not the patient is moving. The data are transmitted to a server, where analytics are used in real time to determine if the patient appears likely to be decompensating. When the system detects a likely decompensation, an e-mail message is sent to an on-duty nurse’s smartphone. With this system, the likelihood that a true decompensation is present has been increased to approximately 50 percent—far better than for cardiac telemetry, for which it is typically 5–10 percent.
  5. Adverse Events: Another use case for analytics will be to predict which patients are at risk of adverse events of several types. Adverse events are expensive and cause substantial morbidity and mortality, yet many are preventable. It seems likely that analytics could be combined with data about exposures to specific medications and with measures of kidney function, blood pressure, urine output, and other processes to identify patients at risk of decompensation. Analytics can also be effective in managing infection. One example involves monitoring and interpreting changes in heart rate variability for detection of major decompensation in infants with very low birthweights before the emergence of an infection. Monitoring the heart-rate characteristics of newborns alone has already resulted in reductions in mortality and increases in the number of ventilator-free days. However, there is room for improvement using increasingly sophisticated analytics that account for subtle signals but also filter out extraneous patterns, such as those that occur when the baby moves.
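The alerting logic described under Triage above can be illustrated with a toy logistic risk score. The coefficients, inputs and scales below are entirely hypothetical; the only element taken from the text is the idea of combining a comorbidity score and an instability score and alerting providers when the risk crosses an 8 percent threshold:

```python
import math

def deterioration_risk(comorbidity, instability, hours_in_hospital):
    """Toy logistic model combining a 12-month comorbidity score, a
    72-hour physiological-instability score and length of stay.
    The coefficients are invented for illustration only."""
    z = -4.0 + 1.2 * comorbidity + 1.8 * instability + 0.01 * hours_in_hospital
    return 1.0 / (1.0 + math.exp(-z))

def should_alert(risk, threshold=0.08):
    """Alert the responsible providers at the 8 percent cut-off."""
    return risk >= threshold

low = deterioration_risk(comorbidity=0.2, instability=0.1, hours_in_hospital=6)
high = deterioration_risk(comorbidity=1.5, instability=1.4, hours_in_hospital=48)
```

In a real system the coefficients would be calibrated on millions of patient records, as the text describes, and the score recomputed in real time as new vitals arrive.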

Challenges for Data Analytics in Precision Medicine: There are, however, various factors that inhibit the use of big data in analytics. Among them: resistance to a systems approach within the medical community; an acute shortage of data analytics professionals in health care; a lack of comparable and transparent data in health care; financial constraints on the adoption of big data analytical systems; and, finally, the lack of interoperability between health care systems.

Perhaps the most difficult challenge to be addressed in a precision medicine research program is that it must have a value proposition that is compelling to all participants, i.e. data providers, patients and healthcare organizations. With the rapid growth and developments in genomics research, data science and electronic health records, it is possible to integrate an individual's information into their clinical care for precision medicine. To address this opportunity, it is necessary to have a large cohort that is participant-centric in concept, so that structured data can be shared with participants, researchers and clinicians, and that is at the same time transparent in purpose, management and execution.

Another challenge relates to the lack of a practical mechanism to uniquely identify participants. The integrity and value of EHR data depend fundamentally on being able to unambiguously link them to the one and only individual whose healthcare they represent. This is sometimes achieved by matching records on several characteristics (probabilistic matching) instead of using a unique identifier, which circumvents but does not eliminate privacy concerns. Unintentional duplicate records, name ambiguities, and the use of another person's identity to obtain care all add complexity to the seemingly straightforward task of linking data to the correct individual. This problem grows in proportion to the number of individuals who participate and the number of different care organizations that generate data about those individuals.
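A minimal sketch of the probabilistic matching mentioned above, using only standard-library string similarity. The fields, weights and sample records are all hypothetical; production record linkage would estimate weights from labelled data (for example, Fellegi-Sunter style models):

```python
from difflib import SequenceMatcher

def match_probability(rec_a, rec_b):
    """Crude match score between two patient records, in [0, 1].
    The weights are illustrative, not a validated linkage rule."""
    name_sim = SequenceMatcher(
        None, rec_a["name"].lower(), rec_b["name"].lower()
    ).ratio()
    dob_match = 1.0 if rec_a["dob"] == rec_b["dob"] else 0.0
    zip_match = 1.0 if rec_a["zip"] == rec_b["zip"] else 0.0
    return 0.5 * name_sim + 0.35 * dob_match + 0.15 * zip_match

a = {"name": "Asha R. Verma", "dob": "1984-03-02", "zip": "400001"}
b = {"name": "Asha Verma", "dob": "1984-03-02", "zip": "400001"}  # likely the same person
c = {"name": "B. K. Singh", "dob": "1991-11-20", "zip": "110016"}  # a different person
```

Records scoring above a chosen threshold would be linked; as the text notes, this circumvents but does not eliminate the privacy and ambiguity problems.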

To complicate things, existing consortia that extract, transform, “clean” and analyze clinical data for research purposes have depended extensively on the local expertise of informatics teams at each of the data-contributing sites. Such expertise is currently unavailable in most healthcare institutions outside large academic medical centers. The paucity of experts constitutes a bottleneck in the development of a national precision medicine (PM) cohort that includes all care settings.

Further, certain technical specifications must be methodically drafted and all possibilities integrated. To make progress toward a reliable research data resource based on individuals accessing their records as study participants, there is a need to define a common structure for granular research-related data, to agree upon definitions (semantics) for key research data and, finally, to develop an extensible “comprehensive electronic data format” that will accommodate both structured and unstructured data.

Transforming the realities of real-world EHR data into interoperable, standardized data structures requires significant vendor-to-vendor and site-to-site interpretation. EHR vendors and healthcare providers must be given guidance and testing capabilities regarding how to configure systems for these new requirements.


Research Methodology in the Classroom

I began actively delving into research methodology in the classroom some 15 years ago, purely out of passion. I was, and still am, an economics and finance professor, and it therefore seemed more logical that I would be interested in the performance of stocks and investment products rather than in anything related to primary research. But the moment I got a chance, I was finding creative ways to devise projects that had some element of figuring out what people did, what they thought, how they made decisions, what went on in their lives and so on. Initially I did a fair amount of primary research around investment decisions, such as those pertaining to mutual funds, stocks, pensions or insurance, or related to the risk profiles of investors. These research projects were limited to students doing their summer or final projects and were extensions of their main project.

Thereafter I took the plunge and entered the hallowed portals of a research methodology class. Very soon we ventured into a whole host of research areas, some as mundane as what factors and features people considered when choosing cars, toothpastes, shampoos, hair oils or mobiles, how much they would spend on them, and what their brand recall was.

We also did things that were not entirely business-based, such as what kinds of causes people would like to support, whether celebrity endorsement matters for specific products, why people cheat in exams and what goes on in their minds when doing so, and even something as profane as which language is most preferred when you want to swear so that it really hits hard.

And then there were some public health topics, such as what kinds of ailments people living in slums regularly suffer from, how much they spend on them, and whether their primary medical provider of choice was a private practitioner or a government hospital. I remember even doing one on self-medication for various ailments, its frequency and the most preferred choice of treatment per ailment.

All these projects were done with students, and we spent countless hours devising and revising questionnaires. Those were the days before Google surveys, when adding logic to questions was really difficult. We squeezed the questions into a maximum of three pages so that no survey would take more than 15 minutes, inclusive of demographic questions. Those instruments were hard work, but what really mattered was getting responses to those questions. We sent the groups out on field trips, each group given a target of 100 surveys, 20 per student, with some instructions on quotas for female and male respondents and for age and income categories. But however much we tried and trusted, only some students went to the field; most simply went home or to the hostel and filled the forms in themselves. As a result, very few projects got converted into full-fledged research papers, simply because we failed on one parameter: ethics in data collection and representative sample selection.

I shifted jobs, and for six years there was a lull in research methodology, not because I did not want to teach it, but because the institutional leadership wanted me to teach only finance, economics and strategy, those being their primary focus. Two years after that, I now work entirely in the area of applied research, and yes, primary research is part of the portfolio, except that my role usually demands that I analyze the data after its collection. The shortcuts in data collection and instrument design continue to hassle me, despite the ease that electronic means of data capture sometimes bring.

I recently got an opportunity to go back into the classroom: a new set of students, a fresh list of projects, a new chance once again to teach research methodology. This time I chose projects targeted at the 18-25 age category, revolving around issues that concern them directly. I am also armed with Google surveys, so students don't have to go out into the field; they can simply WhatsApp the link to potential respondents, all of whom have smartphones. The target respondent group was meant to be urban, educated and media savvy, so that they would have no problem filling in the questionnaires at all. We also made sure that the questionnaire would take a maximum of 15 minutes even if someone took their time, and demographic questions were minimal and placed at the end. The language was simple and direct, and no logic was added to questions, to keep them clutter free. I purposely did not mandate email addresses as a compulsory question, simply because I want to practice anonymity and protection of respondents' privacy as a part of research ethics, since those email addresses would be available to students and many of the respondents could be female as well. The targets are still the same: a group of 4 students, a target of 100 responses. Yes, of course I get the required number of responses, but I also learn that each student auto-filled multiple questionnaires, each doing around 8-10 of them. Some responses are no doubt genuine, and one will figure out a way to remove duplicate responses, so maybe I can still derive some research insights, but not enough to achieve research-paper status.
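One simple way to flag the auto-filled submissions described above is to look for identical answer patterns across respondents; a minimal sketch on hypothetical data (real screening would also consider timestamps, straight-lining and free-text fields):

```python
def flag_duplicates(responses):
    """Return (duplicate_id, first_seen_id) pairs for submissions whose
    closed-ended answer patterns are identical, a crude proxy for
    self-filled or copied questionnaires."""
    seen = {}
    duplicates = []
    for rid, answers in responses:
        key = tuple(answers)  # the answer pattern, as a hashable key
        if key in seen:
            duplicates.append((rid, seen[key]))
        else:
            seen[key] = rid
    return duplicates

# Hypothetical 5-question Likert responses
survey = [
    ("r1", [3, 5, 2, 4, 1]),
    ("r2", [3, 5, 2, 4, 1]),  # identical to r1 -> suspect
    ("r3", [1, 2, 4, 5, 3]),
]
suspect = flag_duplicates(survey)  # [("r2", "r1")]
```

Exact-match screening like this only catches the laziest fills; near-duplicate patterns need a similarity threshold rather than strict equality.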

Yes, the students will complete their projects and will learn a bit about framing a research question and testing a hypothesis… What they have still not learnt is research ethics and ethics in data collection. These are the same students who will one day be corporate professionals and be responsible for making decisions based on research…