The Hidden Biases in Big Data

fastcompany:

“Data are assumed to accurately reflect the social world, but there are significant gaps, with little or no signal coming from particular communities.”

Any psychological study that is non-invasive or doesn’t take a lot of time is probably done with a sample of university students on the east or west coast. Many neurological or medical studies that are non-invasive or minimally invasive (such as swabbing saliva) are as well.

A lot of medical and human biology studies cite a requirement of “healthy individual,” meaning that they exclude anyone with a chronic illness. Most studies also exclude people on medication for almost any reason. A huge number of studies require women to NOT be taking hormonal birth control of any kind, meaning a ton of women are excluded, which adds more bias.

Almost all drug trials are done on disadvantaged populations. A lot of them are advertised (directly contrary to bioethics) as a potential treatment, with wording chosen carefully enough to pass ERBs but framed so that people sign up expecting treatment. This is especially true of drugs being tested for addiction and mental illness.

Most drugs that cannot be tested as above are tested in poor countries where there is no access to health care, so the trial is also the only means of treatment available. Basically all malarial treatments are field tested. (Though all of these do go through animal trials first, so they are pretty sure they aren’t outright poison.)

On a less morbid note, far too many surveys are still done via land lines. LAND LINES. And during the day. Meaning they sample old people. And that’s it. For a very long time, homeless people have been left out of these surveys, and now youth are left out too.
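To put rough numbers on the land line problem, here is a minimal sketch in Python. Every figure in it is made up for illustration (the age groups, their share of the population, how many hold some opinion, and how likely a daytime land line call is to reach them), and the function names are hypothetical. It only shows the arithmetic of how the estimate drifts when one group picks up the phone far more often than the others.

```python
# Hypothetical population: every number here is invented for illustration.
# share   = fraction of the whole population in this group
# support = fraction of the group holding some opinion
# reach   = chance that a daytime land line call reaches someone in the group
groups = {
    "under 30":    {"share": 0.30, "support": 0.70, "reach": 0.05},
    "30 to 64":    {"share": 0.50, "support": 0.50, "reach": 0.20},
    "65 and over": {"share": 0.20, "support": 0.30, "reach": 0.60},
}


def true_support(groups):
    """Support in the whole population, weighting each group by its share."""
    return sum(g["share"] * g["support"] for g in groups.values())


def surveyed_support(groups):
    """Support among the people the survey actually reaches."""
    reached = sum(g["share"] * g["reach"] for g in groups.values())
    supporters = sum(
        g["share"] * g["reach"] * g["support"] for g in groups.values()
    )
    return supporters / reached


print(f"whole population: {true_support(groups):.2f}")
print(f"land line survey: {surveyed_support(groups):.2f}")
```

With these invented numbers the survey lands noticeably below the real figure, because the oldest group makes up about half of the people reached while being only a fifth of the population.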

Internet surveys are obviously biased towards people who spend more time online, and depending on where the poll is and who links to it, you get huge issues. Just think about how many links you have seen on Tumblr telling you to skew a poll.


(I haven’t read the article yet.)

(Source: rickross10)
