Machine Learning: The Catalyst of a Reproducibility Crisis in Science

Science and technology have taken us from the stone age to the space age, from the horse-drawn carriage to the self-driving car. But this rapid development has come at a cost. As the number of scientific papers published each year increases, it becomes harder and harder to tell good science from bad science, or even fake science from real science. This is not just a problem for trust in research as a whole; it also harms research itself, as reproducing results becomes more and more challenging.


Reproducibility is not new

The concept of reproducibility is not new to science. In fact, it is one of the foundations the scientific method is built upon: if an experiment cannot be reproduced, it is not considered scientific, and this has been the case for centuries. With the advent of machine learning, however, there is now a new reproducibility crisis in science. A team at Stanford recently published a paper entitled Why we can't trust results from small data sets. They found that computer-generated fake images were often indistinguishable from real images. Another study found that two out of three artificial intelligence algorithms labeled samples as borderline or malignant more than 80% of the time in cases that had been correctly classified as benign. With these and other findings in mind, there are four major questions that need to be answered about the use of AI in science:

1) How do you know if your algorithm is generating correct results? 

2) How do you know if your algorithm is predictive? 

3) How do you know if your algorithm has any bias? 

4) What will happen when AI takes over all tasks previously performed by scientists?

These questions should be asked before relying on data produced by machines. Machine learning may solve many problems in science, but there needs to be a balance between the benefits and risks of using this technology, and scientists must question how reliable their methods are and how much influence bias has on their research.

It is important to remember that some biases have no connection to machine learning at all. Scientists might unintentionally discard datasets because they don't align with previous work in the field (a form of selection bias). Groups might adopt arbitrary rules, such as treating a p-value threshold of 0.05 as the dividing line between real and unreal effects, which can lead to circular reasoning and groupthink. And people can reinforce their own beliefs by looking only for evidence that confirms what they already think instead of trying to disprove their hypotheses. Bias can exist even without machine learning being involved, so it is important to investigate potential sources of bias beyond the reliance on AI.

One way researchers can minimize bias is by conducting blinded experiments: the researcher does not know which participants received the intervention until after the data have been analyzed. Even though the analyst knows what outcome they are looking for, blinding reduces the chance of selecting results that match preconceived notions (a minimal code sketch of this idea follows below). Along the same lines, studies should be independently replicated by different teams to confirm that they find the same results. Well-known examples of replication failure include Dr. John Ioannidis' 2005 article in PLOS Medicine, reporting that around half of highly cited original research is either wrong or exaggerated, and a 2016 JAMA meta-analysis concluding that only 11% of 67 high-impact medical interventions were successfully reproduced.
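
As a rough illustration of the blinding idea above, here is a minimal Python sketch. The data, column names, and effect sizes are made up for the example and are not taken from any real study.

```python
# A minimal sketch of a blinded analysis. The data here are synthetic
# placeholders; in a real study, "group" would record who actually
# received the intervention.
import random

import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "group": ["treatment"] * 50 + ["control"] * 50,
    "value": np.concatenate([rng.normal(1.0, 1.0, 50), rng.normal(0.5, 1.0, 50)]),
})

# Blind the analyst: replace real group names with neutral codes chosen at random.
codes = dict(zip(["treatment", "control"], random.sample(["A", "B"], 2)))
data["coded_group"] = data["group"].map(codes)

# Run the comparison using only the neutral codes.
a = data.loc[data["coded_group"] == "A", "value"]
b = data.loc[data["coded_group"] == "B", "value"]
t_stat, p_value = stats.ttest_ind(a, b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Unblind only after the analysis is locked in.
print("code key:", codes)
```

Unblinding only after the analysis is frozen keeps preconceptions from steering which comparisons get reported.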


Cutting corners leads to flawed results

In an effort to achieve ground-breaking results, some scientists are cutting corners. They're not taking the time to properly document their methods, or they're fudging the data to fit their hypotheses. This might get them published in the short term, but it's not sustainable in the long run. Sooner or later, their flawed results will be exposed, and the scientific community will lose faith in their work.

It doesn't have to be this way! Some scientists are making sure that the algorithms driving their experiments take care of those pesky details for them. Algorithms can make calculations far more efficiently than humans; they don't need sleep or food breaks, and, the argument goes, they don't make mistakes driven by fatigue or emotion the way we do. These systems, known as machine learning (ML) systems, can predict how chemicals react with each other better than any human scientist could. And even if ML systems make some errors at first, they quickly improve as they continue to learn from their mistakes. No wonder there is so much excitement about ML lately: it promises to fix all of science's problems! But is it too good to be true?

There are plenty of challenges to bringing ML into research labs. First, people worry that machines will replace the jobs of scientists and researchers. What happens when everyone retires? How many people must graduate with STEM degrees every year to keep up with demand? Second, until now one thing humans have always had over machines is creativity: we can think outside the box and come up with new ideas where machines can't. If machines become creative enough, maybe that won't matter anymore. Third, while machines can do complicated math very quickly, people still perform best when it comes to interpreting information, and if a person is already struggling, adding a new task can make things worse rather than better. And finally, if computers start doing everything, what happens to us? Machines aren't perfect either and sometimes break down unexpectedly. Can you imagine what would happen if our most precious resources went offline for days or weeks at a time?

Let's not forget that human error reportedly accounts for 98% of software bugs, and most bugs go undetected. Imagine what could happen without our smarty-pants machines around to help out! The United States faces a shortage of close to 500,000 qualified scientists and engineers. Scientists rely heavily on computing power and automation to process large amounts of data, explore potential solutions, and validate their conclusions. When surveyed about which tools they use most often in their work, almost half of respondents chose computing power. The conclusion is that modern science is highly dependent on automation, which only makes sense, since machines are faster and less biased than humans.

Some skeptics say AI (artificial intelligence) is dangerous because it replaces all aspects of society, including our intelligence. The last few decades have seen a dramatic increase in the number of people reliant on machines. We rely on them to be our doctors, chefs, and personal assistants. Robots are even replacing teachers in some schools! And the list goes on. The concern is that eventually humans will be replaced by machines altogether, and none of our human skills will be needed: we would be nothing but empty shells with nothing to do except push buttons and swipe screens all day long.

This idea isn't all bad, though. Some argue that if humans don't work for corporations, they'll spend more time fulfilling their creative urges and creating art, music, or films for the public to enjoy. If humans aren't working, they won't need to eat as much, so there's less food wasted. Machines could make clothes without harming animals. In general, life would be easier without all the stress of jobs.

But there's one thing that worries me about this scenario: what happens when we can't figure out how things work anymore? If we're relying solely on machines to do everything for us, what happens when one machine starts acting up? That might sound like a strange worry, but think about it this way: the rise of automation means fewer and fewer people know how things actually work. Fewer scientists, technicians, engineers, and experts understand how artificial intelligence works. Fewer people with hands-on experience understand these systems from the inside out. As AI becomes ubiquitous in everyday life, an increasingly small group of experts is left responsible for teaching humanity how these systems work, even though many of those experts themselves struggle to stay abreast of new developments. Think about something as simple as an ATM: years ago a person behind the scenes took your money and handed your card back before you left. Nowadays nobody is manning the front desk, so to speak, yet ATMs still function perfectly well thanks to modern programming techniques that mimic human decision-making in computer code.


Do we have the infrastructure for repeatable experiments?

No, we do not have the infrastructure for repeatable experiments. To really take advantage of machine learning, we need access to data that is clean, well labeled, and consistent. We also need to be able to track the provenance of that data so that we can understand how it was generated and whether it has been tampered with. Unfortunately, most scientific data is not well organized or well documented, which makes it difficult to use for machine learning. Additionally, the current reproducibility crisis in science means that many experiments cannot be repeated or verified, which further complicates matters. Even if an experiment were reproducible, in many cases there is no true ground truth about what the correct result should be; for example, when scientists are looking at cell growth rates under different conditions. If there's no ground truth to compare results against, how can you train a machine? So far, the answer appears to be to measure as much as possible and then feed all of that information into deep neural networks. Machine learning algorithms learn from those measurements and try to come up with patterns or rules on their own (a minimal sketch of this idea follows below).
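
To make the "measure everything and look for patterns" idea concrete, here is a minimal sketch. It stands in simple k-means clustering for the deep networks mentioned above, since the point is only that, without ground truth, a model can do little more than group similar measurements together; the measurement matrix is synthetic.

```python
# A minimal sketch of letting a model find structure in raw measurements
# when no ground truth exists. Synthetic data stands in for real lab output.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
measurements = rng.normal(size=(200, 12))        # 200 samples, 12 measured features

X = StandardScaler().fit_transform(measurements) # put all measurements on one scale
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Without ground truth, all we can report is how the samples group together.
print(np.bincount(labels))
```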

There's some evidence this measure-everything approach works well for things like image recognition, because humans tell the systems what's right or wrong before they're turned loose on new input data. But without a human checking every answer, the method isn't ideal for tasks like determining cancer types; it would require too much manual intervention for machines to learn reliably on their own. That said, there is some hope for the future: new tools are being developed that may allow us to automate parts of experimentation using machine learning techniques. What does this mean for your research? One answer might be automated high-throughput screening methods and robots performing biological assays, but these techniques aren't yet practical due to expensive equipment and a lack of mature software development frameworks. Until better solutions are found, researchers will continue to focus on designing careful, controlled experiments that produce clear and unambiguous answers. Although this process takes longer and requires more effort than running computer simulations, it seems worth the time to get quality data instead of wasting effort on unreliable datasets. Still, we know that even under perfect circumstances, accurate predictions are hard to come by. Humans make mistakes too, even if less often than computers.

It's worth asking what will happen if we start making important decisions based solely on machine-learning predictions. Will we find ourselves blindly following prescriptions that haven't been rigorously tested? How reliable are these systems compared to traditional statistical methods? This is a complicated question. To date, the best machine learning models are only slightly better than traditional statistical models in certain scenarios. However, the promise of machine learning is that it can be trained to perform almost any task and is not limited to a particular type of statistical modeling. For this reason, we think it's unlikely that anyone will give up on traditional statistics anytime soon. While this debate continues, one thing is certain: both approaches have advantages and disadvantages, and combining the two could be a good solution. This blog post will outline how that combination could work in the near future, drawing on a recent paper that describes how machine learning and traditional statistical methods can be combined to improve outcomes.

Traditional statistical methods, such as the ANOVA test, are often used to determine whether there is a significant difference between two or more groups. These tests can be run on groups of people with and without a disease, for example, and they will tell you whether the disease has a measurable effect on the data. Machine learning can then be used to predict which group will have a higher response rate (or other outcome); a short sketch of this pairing appears below. This approach is often referred to as human-in-the-loop machine learning, and it's not without its challenges. For instance, predicting individual responses can prove difficult for some medical treatments, since multiple unknown variables affect how each person responds. It is also difficult to collect a large enough sample to train the machine learning model. Thus, many factors can't be accounted for, which leads to inaccuracies. However, machine learning is still useful for extracting additional features from clinical trials that can then be analyzed with traditional statistical methods. In this way, the two can be complementary and provide additional insight into how different treatments affect patients.

For example, machine learning could generate a set of possible diagnoses for a patient, and human experts could then decide which diagnosis is most appropriate. This lets the machine do some of the heavy lifting while leaving human experts to apply their knowledge and judgment where it's needed most. It also helps fill in gaps where data is unavailable or uninformative. If there's no data on how a certain medication affects an ethnic minority population, for instance, machine learning could generate predictions based on data from similar populations. This could provide insight into how different ethnic groups react to certain medications, or at least indicate where more testing is needed. With machine learning, we can make much faster progress because the computer doesn't get tired like humans do. It's less expensive and doesn't require as much manpower to conduct experiments; it also frees researchers to focus on improving the models rather than managing data sets. And lastly, unlike human researchers who make mistakes when conducting experiments, machines don't need coffee breaks!
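
Here is a rough sketch of the pairing described above: a classical one-way ANOVA asks whether groups differ, and a machine-learning model then predicts individual outcomes. Every number in it (group sizes, features, effect sizes) is an invented placeholder rather than anything taken from the paper discussed.

```python
# A hedged sketch of combining a traditional test with a machine-learning
# prediction step. All data are synthetic placeholders.
import numpy as np
from scipy.stats import f_oneway
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
treatment = rng.integers(0, 3, size=n)                  # three treatment groups
patient_features = rng.normal(size=(n, 5))              # e.g. age, labs, vitals
response = 0.8 * treatment + patient_features[:, 0] + rng.normal(scale=1.0, size=n)

# Traditional statistics: does the response differ across treatment groups?
f_stat, p_value = f_oneway(*(response[treatment == g] for g in range(3)))
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Machine learning: predict each individual's response from their features.
X = np.column_stack([treatment, patient_features])
model = RandomForestRegressor(n_estimators=100, random_state=0)
r2 = cross_val_score(model, X, response, cv=5, scoring="r2").mean()
print(f"Cross-validated R^2: {r2:.2f}")
```

The classical test answers the group-level question; the model tries to answer the individual-level one. Neither replaces the other.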
But machine learning can't replace traditional statistical methods either. It has a number of limitations, including its appetite for data and open questions about accuracy and reproducibility. Data is a big issue: as mentioned earlier, it's hard to collect a sample large enough to train the model accurately. And even when we can assemble training data, we can't be sure that the machine's predictions will hold up in real-life situations, because of unforeseen factors.


Are people leaving academia due to this problem?

It's no secret that the number of people leaving academia is on the rise. In fact, a recent study found that nearly 60% of academics are considering leaving their jobs. While there are many reasons for this, one of the most frequently cited is the reproducibility crisis in science. Machine learning is often lauded as a tool that can help solve this problem, but unfortunately, it's also part of the reason the crisis exists in the first place.

If you're not familiar with machine learning, think of it as software that uses algorithms to find patterns and correlations in data and then uses them to make predictions about future data. Essentially, what once required years of work by teams of humans now takes only hours or days. So what does this have to do with the reproducibility crisis? When researchers spend less time on tedious manual labor, they have more time to turn their ideas into valuable products. Machine learning enables these products because, instead of trying every possible combination of variables themselves, researchers can let a computer figure out which combinations work best. However, because machine learning requires so much less human input than manual methods did, some scientists question whether even skilled researchers can understand the outputs computers produce. For example, if a scientist builds an algorithm designed to predict the average length of words based on how often each letter appears in English texts (e.g., "machine" is likely shorter than "photographer"), how would she know if her predictions were accurate? She would need to run her algorithm against other texts and hope that the results matched reality (a toy sketch of this kind of check appears below). But for larger problems, such as predicting long-term weather trends or who will commit a crime next year, it would be impossible for any single person to create an algorithm from scratch without leaning heavily on machine learning, or at least on artificial intelligence (AI) more broadly.

That brings us back to the reproducibility crisis, which becomes more pronounced as machine learning becomes ubiquitous. What was once a qualitative prediction made by hand becomes quantitative, replicable output generated by a machine. While we don't yet know how deep the impact of AI on society will be, it seems clear that machines cannot replicate certain skills, like creativity and experience; those belong uniquely to humans. Some people believe the coming Fourth Industrial Revolution could mean a brighter future for humanity, but others worry that our reliance on machines could eventually lead to job loss. We don't know what the future holds, but it's important to recognize that solving one problem doesn't mean another won't crop up. The same thing happened during the industrial revolution, when textile workers lost their jobs as machinery took over, only to face further displacement as other trades mechanized in turn.
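
For the word-length example above, a toy check might look like the sketch below. The training and test sentences are invented, and the point is only that the honest sanity check is to score the model on text it has never seen.

```python
# A toy sketch: predict a text's average word length from its letter
# frequencies, then check the model against a held-out text.
import string

import numpy as np
from sklearn.linear_model import LinearRegression

def features_and_target(text: str):
    words = text.lower().split()
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    freq = np.array([letters.count(c) for c in string.ascii_lowercase], dtype=float)
    freq /= max(len(letters), 1)                       # letter frequencies
    avg_len = np.mean([len(w.strip(string.punctuation)) for w in words])
    return freq, avg_len

train_texts = [
    "the quick brown fox jumps over the lazy dog",
    "machine learning promises reproducible experiments",
    "short words are common in everyday english prose",
]
X, y = zip(*(features_and_target(t) for t in train_texts))
model = LinearRegression().fit(np.array(X), np.array(y))

# The only honest check is to test on text the model has never seen.
x_new, y_true = features_and_target("photographers measure longer words than machines")
print(f"predicted {model.predict([x_new])[0]:.2f}, actual {y_true:.2f}")
```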

Ultimately, whether or not machine learning is truly the catalyst of a reproducibility crisis in science remains to be seen. But one thing is clear: our society relies too heavily on automation already, and this trend needs to change soon.


Lack of communication as an obstacle to reproducibility

In order for reproducibility to be possible, scientists must be able to communicate their findings clearly and concisely. Unfortunately, this is not always the case. Scientists are often so specialized in their field that they forget to explain their work in layman's terms, which can make it difficult for other scientists to understand and replicate it. Additionally, the language of science is constantly evolving, making it hard for even experts to keep up. As a result, important discoveries can be lost in translation, preventing them from being reproduced. For example, one study found that of 1,000 papers published on deep learning between 2012 and 2016, only six were reproducible.


Lack of transparency as an obstacle to reproducibility

In many cases, publishing papers isn't enough: more data and details need to be shared with the scientific community at large if there is any hope of others reproducing the results. With closed-source software, such as some machine learning systems, it can be impossible for anyone but the original developers to see how the program works and which inputs lead to which outputs. As with lack of communication, lack of transparency makes things difficult because other researchers cannot build on someone else's work without understanding how they got there first. Furthermore, just because someone has the right tools doesn't mean they know how to use them; without access to open-source code or data sets, it can be next to impossible for people who don't have a PhD in statistics or mathematics to reproduce experiments. In a 2016 Nature commentary on reproducibility, Craig Silverman lays out methods journalists could use to encourage transparency among those who conduct studies in fields like psychology, neuroscience, and genetics. For example, he suggests using Freedom of Information Act (FOIA) requests when available, or calling for public release before publication when FOIA does not apply. These efforts would help create a system of checks and balances to stop bad actors from manipulating the scientific process, which matters because science relies on trust.

Some types of reporting may also help incentivize disclosure through prizes or awards, such as the Deming Prize. Another suggestion is for journals to publish error bars alongside statistical significance values; these error bars would show readers the range within which most similar studies' effects lie, giving some indication of whether results seem too good to be true based on previous findings (a small sketch of this kind of reporting follows below). Lastly, labs should establish clear protocols for how and when lab members take measurements (i.e., include detailed step-by-step instructions). Doing so reduces variability in findings and provides a set of instructions that can be easily followed.
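
As a concrete illustration of the error-bar suggestion, the sketch below reports an effect estimate with a rough 95% confidence interval next to its p-value. The group means, spreads, and sample sizes are arbitrary placeholders.

```python
# A small sketch of reporting an effect with error bars (a 95% confidence
# interval) alongside its p-value, rather than the p-value alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
control = rng.normal(loc=10.0, scale=2.0, size=40)
treated = rng.normal(loc=11.0, scale=2.0, size=40)

diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / treated.size + control.var(ddof=1) / control.size)
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=False)  # Welch's t-test

ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
print(f"effect = {diff:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), p = {p_value:.3f}")
```

Seeing the interval makes it much easier to judge whether a "significant" result is also a plausible one.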

