Artificial Intelligence (AI) has already significantly impacted various aspects of our lives, sparking discussions about its benefits and the need for regulation. Whether it’s fashion or any other industry, the influence of AI is a topic of ongoing debate. AI is not yet as smart as the human brain, but it sure does mirror us, albeit without emotions.
We interviewed distinguished AI researcher and senior data scientist Sumedha Rai, based in New York, to delve into her journey in AI and discuss the latest developments, regulations, and future trajectory of the field.
With extensive experience spanning multiple sectors including technology, finance, medicine, and SaaS, Rai’s primary focus lies in Natural Language Processing (NLP), a field of AI that has gained prominence with advancements such as ChatGPT. NLP enables human-machine interaction through language, allowing machines to comprehend nuances in communication, including sentiments, overarching themes, variations in word usage, and context-dependent meanings.
Rai’s work revolves around deciphering user sentiments and categorizations, generating compelling conversation topics, soliciting opinions on contemporary issues, and translating manual notes into actionable insights for software. She thinks the world of NLP and AI offers immense possibilities and finds great joy in exploring its potential.
Tell me a little bit about your background. Where did you grow up, where did you go to college, and what did you study? What motivated you to pursue a career in AI/NLP, and how did you get started?
I grew up and studied across India, the UK, and the US. I was a big computer science enthusiast in high school and was almost greedy to learn anything related to technology. During my Bachelor’s, however, I pivoted to majoring in Economics and subsequently studied Finance during my first Master’s. Coding took a back seat, but I was learning statistics and math, and anything analytical captured and held my interest. I spent a few years in investment banking and private equity, and it was then that I realized I was constantly trying to connect with and evaluate tech companies for investments because I was so enthralled by the technology behind the product.
During evaluation calls, I asked more technical questions rather than focusing on business-related metrics. Two years later, I quit and pivoted towards data science in the policy sector. I aimed to ride the rising tide of machine learning (ML) while still retaining the comfort of being enveloped in economics and finance.
After spending some time in the UK, I worked at the Central Bank of India, where I gained valuable experience. During my time there, I explored how learning ML and AI could integrate into a variety of use cases with the right data and mathematical models. Following my stint in India, I began my graduate studies at New York University, which opened up a whole new world of AI for me!
During my time at NYU, I had the opportunity to learn about deep learning from Dr. Yann LeCun, the founding father of Convolutional Neural Nets. It was a surreal experience, and I discovered something fundamental and beautiful: anything can be considered data if you know how to use it! This realization was transformative and expanded my understanding of the possibilities within the field of AI.
Natural Language Processing is one such field where, for the longest time, we haven’t fully harnessed the potential of textual data or grasped the enormous volume of data available. I appreciate the fact that our curriculum included courses on the ethics of machine learning. These courses were a constant reminder to exercise caution and mitigate the potential harm biased data can cause.
The first research paper I read on a pre-trained transformer, BERT (a language model that Google developed in 2018 to improve NLP), brought me immense joy; it was so eloquently crafted. From that moment on, I actively pursued projects with text data. When I began furthering my career in AI, I strongly advocated for ingesting diverse types of text data to develop actionable insights for the business.
Sometimes, a straightforward model revealing customers’ overall sentiment can significantly influence the future strategy for a newly released feature. This insight can save a considerable amount of time and money. While I am engaged in various machine learning projects, my involvement in Natural Language Processing (NLP) projects has steadily increased and proven successful in deployment. I am also actively engaged in academic research in the field of NLP.
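To make this concrete, here is a minimal sketch of the kind of straightforward sentiment model described above, using an off-the-shelf pre-trained classifier from the Hugging Face transformers library; the comments are hypothetical, and this is not Rai’s production setup.

```python
# Minimal sketch: gauge overall sentiment about a newly released feature
# with an off-the-shelf pre-trained model (hypothetical example, not the
# interviewee's actual system).
from collections import Counter

from transformers import pipeline  # pip install transformers

# With no model name given, a default pre-trained English sentiment model is loaded.
sentiment = pipeline("sentiment-analysis")

# Hypothetical user comments about the new feature.
comments = [
    "The new dashboard is so much faster, love it.",
    "I can't find the export button anymore, this update is frustrating.",
    "Works fine, nothing special.",
]

results = sentiment(comments)
tally = Counter(r["label"] for r in results)
print(tally)  # e.g. Counter({'POSITIVE': 2, 'NEGATIVE': 1})
```

Even a tally this simple, run over thousands of comments, gives a quick read on how a release is landing.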
Tell me about some of your biggest accomplishments/awards/projects you’re most proud of.
My passion for coding goes back to high school. I graduated from high school in India, where I formally learned coding for four years in two different languages as part of my curriculum. My dedication paid off when I was ranked number one nationwide among approximately 100,000 students who took the final exams each year. Achieving the “All India Topper” title was a personal triumph I hold dear and take great pride in.
Upon entering my first master’s program, I was awarded a full scholarship, a recognition granted to only two individuals in the university. While academic achievements are significant and reflect years of hard work, professional success also requires consistent effort and dedication.
One of my proudest projects was building a model for a bank that predicted the probability of default – the likelihood that a borrower will fail to meet their debt obligations. The dataset for this endeavor comprised a mix of structured and unstructured data, including numeric and text values. Some of the data was proprietary, and some of it was publicly web-scraped. The sheer volume of data I processed for this prediction was staggering, and it became one of my longest-running projects. The bank where I implemented this model successfully deployed it for two years before we updated it to incorporate newer techniques.
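As an illustration only (not the bank’s actual model), a probability-of-default classifier that mixes structured numeric fields with unstructured text might be assembled along these lines in scikit-learn; the column names and records here are hypothetical.

```python
# Hypothetical sketch: combine numeric features with free-text notes to
# estimate probability of default.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "debt_to_income": [0.2, 0.9, 0.4, 1.3],
    "years_in_business": [12, 1, 6, 2],
    "notes": [  # e.g. analyst notes or publicly scraped filings
        "stable revenue, long-term contracts in place",
        "missed supplier payments, shrinking margins",
        "moderate growth, some customer concentration risk",
        "ongoing litigation and negative operating cash flow",
    ],
    "defaulted": [0, 1, 0, 1],
})

features = ColumnTransformer([
    ("numeric", StandardScaler(), ["debt_to_income", "years_in_business"]),
    ("text", TfidfVectorizer(), "notes"),  # single column name for text
])

model = Pipeline([("features", features), ("clf", LogisticRegression())])
model.fit(df[["debt_to_income", "years_in_business", "notes"]], df["defaulted"])

# Estimated probability of default for a new borrower.
new_borrower = pd.DataFrame({
    "debt_to_income": [0.7],
    "years_in_business": [3],
    "notes": ["declining sales and a recent covenant breach"],
})
print(model.predict_proba(new_borrower)[:, 1])
```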
One of my other proudest achievements is in the field of fraud modeling, a field I actively create models in and also speak about. Fraud prevention is an integral component of every financial institution’s operations. Leveraging machine learning, we can proactively mitigate fraudulent activities by forecasting the likelihood of fraudulent events and taking preemptive measures. This initiative comprises a suite of specialized models tailored to combat various forms of fraud, which we identify as pivotal to safeguarding our organization. My overarching predictive modeling endeavors generally encompass a robust array of advanced fraud prevention techniques, fortifying defenses against threats such as account takeovers, transaction fraud, payment fraud, and similar malicious activities.
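Fraud is typically a rare event, so a sketch like the following (again hypothetical, not the production system Rai describes) shows one common way to handle the imbalance: weight the minority class and tune the alert threshold rather than accept the default 0.5 cut-off.

```python
# Hypothetical sketch of fraud scoring on imbalanced data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 4))             # e.g. amount, velocity, device/geo signals
y = (rng.random(n) < 0.02).astype(int)  # ~2% of events are fraudulent
X[y == 1] += 1.5                        # fraudulent events look slightly different

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" compensates for how rare the fraud class is.
clf = LogisticRegression(class_weight="balanced").fit(X_train, y_train)

scores = clf.predict_proba(X_test)[:, 1]
alerts = scores > 0.8  # stricter threshold: fewer, higher-quality alerts
print("precision:", precision_score(y_test, alerts),
      "recall:", recall_score(y_test, alerts))
```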
What are some real-world applications of NLP that have had a significant impact on various industries?
Large language models are currently at the forefront of discussions, and ChatGPT, rooted in Natural Language Processing (NLP), is creating significant waves. It is utilized daily by millions of people across various industries, highlighting the impact of leveraging vast amounts of data with the right model architecture. This technology demonstrates an incredible ability to understand context like a human and deliver real-time responses that are grammatically precise and highly relevant. Remarkably, all of this is achieved by a trained machine.
As an NLP researcher, I possess an understanding of the underlying building blocks of this model architecture. Yet, despite my expertise, I am continually amazed by its capabilities. The fact that a machine can exhibit such sophisticated language skills never fails to impress me.
I’ve gained substantial experience in the fintech and healthcare domains. The automation of tasks, such as analyzing financial and legal documents and categorizing them into relevant sections, has proven to be a remarkable time-saver. Many banks have streamlined these processes, leading to significant efficiency improvements in upstream and downstream operations.
Furthermore, sentiment analysis based on comments, surveys, and social media data following crucial market announcements allows for a near-real-time understanding of public reception. These insights are invaluable for financial firms as they enable swift assessment of public sentiment and facilitate data-driven decisions.
NLP has found extensive application in medical document processing and analysis in healthcare. I do research with OLAB, a healthcare lab at NYU that uses ML-guided investigations to augment clinical work in treating neurological disorders and cancer. One notable project from the OLAB research team, NYUTron, is a clinical language model trained on unstructured notes written by physicians. It predicts crucial metrics like patient readmission rates, length of stay, insurance denial, and even in-hospital mortality. These predictions are pivotal in healthcare, where accuracy is a matter of life and death, potentially saving lives through timely interventions.
What are NLP’s current challenges and limitations, and how is the field addressing them?
Since NLP is a language-specific technology, its challenges are similar to those encountered when teaching a new language to a beginner. Grasping the alphabet and mastering grammar represent just one aspect. Additionally, we use the same words to convey different meanings based on their context of usage. We have made significant progress in teaching machines this nuanced understanding, referred to as ‘context,’ by exposing them to numerous language examples.
However, difficulties occur when attempting to convey sentiment through the ‘tone’ of our voice. Sarcasm and a straightforward sentence might employ the same words, but their delivery distinguishes their intent. What may seem positive to a machine could carry a negative connotation. Teaching a machine something we often take for granted—the concept of ‘common sense’—also poses challenges. Moreover, the ever-changing landscape of slang requires constant model updates to keep pace with this informal aspect of language.
Another challenge I have encountered relates to industry-specific language, often called industry jargon. An NLP model well-suited for healthcare may not yield accurate results when applied to financial or legal documents. Researchers are continuously involved in training industry-specific models to enhance the accuracy and usefulness of this technology across various fields. However, challenges persist in highly specialized fields or areas lacking sufficient data to train models for optimal performance.
The lack of digital data for some low-resource languages poses a significant hurdle. Due to limitations in data resources, training NLP models for these languages proves difficult.
Are there ethical concerns or biases associated with NLP, and how can they be mitigated?
This is one of the most important questions being discussed today. It is also a question every researcher should consider when preparing the data for their models. In simple terms, a biased dataset used to train a model will inevitably lead to biased results from the model.
Harvard Business Review highlighted the detrimental effects of biases in NLP, as they could potentially hinder individuals’ opportunities and economic participation. As an example of gender bias in models, Amazon’s previous resume-filtering algorithm displayed a notable preference for words like “executed” or “captured,” which were more commonly used by male applicants.
However, when I asked ChatGPT a gender-biased question such as “What professions do men generally have?” it responded with, “Professions are not inherently gender-specific, and one’s gender should not limit the choice of profession. In many parts of the world, there has been a move towards breaking down traditional gender roles, and individuals of all genders pursue a wide range of careers.” This demonstrates that the model can adjust and reduce biases if the underlying data is updated to reflect a more contemporary understanding of our society.
I cannot stress enough how careful we must be about biases when we train our machine-learning models. Biased or unbalanced data ingestion can lead to results that may discriminate based on race, religion, or demographics.
For example, imagine you want to build a model to predict the most common professions of people from Columbia University. However, if you only collect data from Columbia alumni who became journalists because you know a lot of journalists from Columbia, your model might incorrectly conclude that most Columbia graduates pursue journalism careers. This is a classic case of selection bias because your sample, in this case, only includes a specific group of people (journalists), and it doesn’t accurately represent the diverse range of professions that Columbia graduates actually pursue. As a result, your model’s predictions about the most common professions would be biased and inaccurate due to the limited and unrepresentative sample.
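A small simulation makes the point: if the sample contains only journalists, any estimate of the overall profession mix is predetermined by the sampling, no matter how much data you collect. The population proportions below are invented purely for illustration.

```python
# Toy illustration of selection bias: a sample of only journalists badly
# misrepresents a hypothetical "true" mix of graduate professions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
professions = ["journalism", "finance", "law", "engineering", "medicine"]

# Hypothetical true population of 10,000 graduates.
population = pd.Series(rng.choice(professions, size=10_000,
                                  p=[0.10, 0.30, 0.25, 0.20, 0.15]))
print("Population shares:")
print(population.value_counts(normalize=True))

# Biased sample: data collected only from journalists you happen to know.
biased_sample = population[population == "journalism"].sample(200, random_state=0)
print("\nBiased-sample shares:")
print(biased_sample.value_counts(normalize=True))  # 100% journalism
```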
Now imagine if these biases were related to sensitive matters such as crime, social issues, and other public concerns– they could have significant and undesirable repercussions. To mitigate these concerns, rigorous anti-bias checks should be integral to data preparation for NLP models. Regular updates and incorporating new and balanced data from credible sources can enhance the robustness of the models and minimize the risk of discriminatory predictions. Additionally, I believe that for critical issues, the utilization of NLP models should be coupled with impartial human review of the results to guarantee fairness and accuracy.
Some famous authors have sued OpenAI for copyright infringement, alleging illegal use of copyrighted material to train AI models. If the plaintiffs win, how could this case reshape AI?
I wouldn’t say this is surprising. Training a large language model requires a huge amount of data, and some authors or content creators may be uncomfortable that this training was conducted without their explicit consent or involved data repositories containing copyrighted information. I’m uncertain about the feasibility of obtaining approvals for such an extensive volume of data used in model training.
To my understanding, organizations such as OpenAI and other prominent developers are allocating legal funds and resources to tackle such situations. This approach offers mutual benefits and serves as a means of collaboration with writers.
Such cases may become more common as large language models continue to be developed for various applications. While data available in the public domain can be incorporated without legal issues, we may witness lawsuits arising from other concerns. However, I don’t think this will significantly hinder AI’s almost unfettered pace of experimentation and innovation. Alternative methods and agreements will probably surface to make this process advantageous for both parties involved.
Some people are concerned that AI is going to replace jobs. Others say that employees who know how to utilize AI will stand out in their fields. What do you think?
Personally, I lean towards the second viewpoint with a slight modification. I believe companies and individuals who have embraced AI and kept pace with the rapidly evolving and potent technology are more likely to thrive in the future.
There was a time when computers were a novel innovation, and some individuals hesitated to adopt them due to trust issues surrounding their use to store and process sensitive information. Today, it’s almost impossible for a business to remain entirely offline.
We are moving in the same direction regarding AI. Early adoption and education will be crucial for success in certain industries. Furthermore, individuals with expertise in using AI will possess a valuable skill set that will become increasingly important in the hiring process. In the realm of economics, the traditional labor-versus-capital paradigm is evolving. While some manual roles may become automated, this will also enable the workforce to focus on more innovation-driven positions that drive technological progress.
When it comes to innovation, nothing can truly compare to the human mind. So, while there might be a reconfiguration of roles, I am not in favor of being overly apprehensive about such a powerful technology. History has shown that similar narratives played out when computers were introduced, yet they ultimately brought significant positive changes.
What are your thoughts on fashion AI?
When people think about AI, the terms that immediately come to mind might not include ‘fashion.’ I’ve noticed discussions about AI in medicine, marketing, and algorithms, but there is less emphasis on fashion and AI, despite my belief that this industry was an early adopter of AI technology.
We’ve seen virtual ‘try-ons’ for clothing and accessories, visual searches for products, and lookalikes based on a user’s picture. Recently, AI has been used for sustainable fashion, inventory management, and supply chain optimization. An interesting take on fashion and AI that I read about recently involved predicting the different fabrics used within a clothing line and automating the labeling of garments. As in medicine and finance, I think AI will change how we embrace “fashion” and adopt it in the future.
In the field of NLP, I’m excited about the prospect of fashion utilizing text prediction analysis to assess recent trends and gauge user sentiments towards them. This technology could extend its reach to a broader audience through multilingual support within brands, particularly in content creation, customer support, and feedback collection. Additionally, this data could enhance personalized user recommendations and overall user experience.

Do you think AI can replace fashion models or actors in the future?
This question is exciting because, from my perspective, the fashion industry has been at the forefront of embracing AI. An excellent example is the Metaverse Fashion Week held in March 2023, which was already the second time this event was hosted. It was fascinating to see that some of the biggest names in the fashion industry participated, and the event was truly remarkable. It seamlessly blended the digital and real worlds, creating a lifelike experience. The avatars, runway shows, networking spaces, and, oh, the dresses themselves were all so impressive. It’s a testament to the fashion industry’s innovative approach to integrating AI and technology to create captivating and immersive experiences.
Despite the integration of AI, I remain unconvinced that it can entirely replace fashion models or actors. In these industries, audience relatability is paramount. The genuine glow of a skincare product on a human model, the confidence exuded by a woman strutting down the runway in a flattering dress, or the emotion conveyed by an actor to their audience is challenging to replicate digitally. These experiences are relatable because they allow you to envision yourself in their shoes. To put it somewhat poetically, I firmly believe that the irreplaceable and intangible human essence that connects the fashion and entertainment industries with their consumers cannot be mimicked. While the allure of integrating this technology is undeniable, and these industries are known for their adaptability to new trends, I foresee a journey of integration and enhancement rather than outright replacement.
While integration is positive, I believe it must be undertaken with an ethical foundation, keeping in mind the well-being of employees. Implementing changes in any industry should be done in a regulated way and at a measured pace.
What laws and regulations need to be in place as AI further develops?
It’s interesting that you’ve raised this question because I recently wrote an article discussing the need for AI regulations in various countries. One of the biggest concerns has been data privacy, given some AI models’ extensive use of personal data. Individuals should have the right to determine precisely how much data they are comfortable sharing. If that choice makes certain services unavailable to them, that’s fine, too. However, the essential element is the availability of that choice.
In many instances, I’ve encountered people willing to disclose information if it leads to enhanced convenience, while others are more cautious. This ‘right to choose to disclose’ is, in my opinion, very important. It is problematic that many companies present lengthy and convoluted agreements about data disclosures, often leaving consumers unsure about what they consent to disclose. This practice troubles me, even as someone who recognizes the importance of data for building effective AI models.
The European Union (EU) has been notably proactive in safeguarding the data rights of its citizens. The General Data Protection Regulation (GDPR) is one of the most rigorous data privacy laws in existence. It not only mandates compliance for organizations within the EU but also applies to any company, regardless of location, if it handles data related to EU citizens. Several other countries are developing and implementing data privacy regulations, while some have minimal regulations.
With the White House’s announcement of the Executive Order on AI, we are much closer to having comprehensive federal data and AI laws in the United States. This development is significant, as the absence of federal frameworks leads to individual states enacting their own privacy laws. Such a scenario could complicate the operations of AI businesses working across state lines in the future, as compliance requirements would vary from one state to another.
What advice do you have for individuals who want to pursue a career in NLP/AI or enhance their skills? Are there specific resources, courses, or tools you recommend for learning NLP/AI?
Drawing from my personal experience, adopting a structured approach greatly helped my understanding of the fundamentals of AI. Before pursuing my Master’s in Data Science at NYU, I had some experience in data science. However, during my Master’s, I dug deep into foundational subjects like linear algebra, probability and statistics, data structures, and algorithms. I also learned how to write cleaner, more efficient code that aligned with coding best practices. These skills serve as the cornerstone for machine learning and AI, and I firmly believe that taking shortcuts in learning AI would prove detrimental over time.
While using pre-built libraries can expedite the coding process, it’s very important to understand the underlying principles. You must be able to diagnose technical issues at the very source. If a model yields unsatisfactory results, it’s essential to question its fundamental assumptions. Similarly, to convert a problem statement into a predictive model, you must be able to conceptualize how to quantify it and prepare it for machine processing. Therefore, if you haven’t yet familiarized yourself with the basics of mathematics, statistics, and computer science required for ML, I recommend seeking structured online courses to acquire these skills. YouTube channels like “3Blue1Brown” offer good visualizations of linear algebra concepts, and it helps to learn from visual representations of complex ideas.
Additionally, acquire real-world datasets for practice in model development. Platforms such as Kaggle provide problem statements and datasets, and numerous other websites offer free datasets for experimentation.
For those interested in NLP, I remember reading the research paper that initiated the wave of transformer models, titled “Attention Is All You Need” by Vaswani et al. I strongly recommend this paper for those starting in the field. Jay Alammar’s blog is another resource known for its intuitive explanations and helpful illustrations. You can also benefit from video lectures and courses by renowned experts in the AI field.
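For readers who want a first taste of what that paper describes, below is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, the transformer’s core operation; the shapes and inputs are illustrative only.

```python
# Minimal sketch of scaled dot-product attention from "Attention Is All You Need".
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)        # query-key similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax over the keys
    return weights @ V                                    # weighted sum of values

# Illustrative shapes: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```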
Dr. Andrew Ng offers courses on Coursera and other online learning websites (even YouTube – DeepLearningAI). Professor Canziani at NYU strongly advocates free education and has shared YouTube videos and lecture notes on Deep Learning on his website. Many universities also release lecture notes and videos for official courses taught online.
If you can’t find code notebooks for practice, I recommend seeking out coding examples or notebooks that contain actual code for NLP or AI problems, understanding them, and then attempting to replicate the process with a different dataset or another use case. Learning from your coding mistakes is key to understanding and applying these concepts to real-world applications.
***
If you’re interested in further reading on AI, here are some published articles by Sumedha Rai:
Sumedha Rai’s Op-Ed on “the Race for AI” in The Business World.
Comparing Large Language Models
You can follow Sumedha on LinkedIn
Or visit her website