<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


<meta name="Generator" content="Microsoft Word 15 (filtered medium)">


<!--[if gte mso 9]><xml>


<w:WordDocument>


<w:DontUseAdvancedTypographyReadingMail/>


<w:DontUseJustificationAdvancedTypographyReadingMail/>


<w:DontUseHyphenationAdvancedTypographyReadingMail/>


</w:WordDocument>


</xml><![endif]--><style><!--


/* Font Definitions */


@font-face


        {font-family:"Cambria Math";


        panose-1:2 4 5 3 5 4 6 3 2 4;}


@font-face


        {font-family:Aptos;}


/* Style Definitions */


p.MsoNormal, li.MsoNormal, div.MsoNormal


        {margin:0in;


        font-size:12.0pt;


        font-family:"Aptos",sans-serif;


        mso-ligatures:standardcontextual;}


span.EmailStyle20


        {mso-style-type:personal-reply;


        font-family:"Aptos",sans-serif;


        color:windowtext;}


.MsoChpDefault


        {mso-style-type:export-only;


        font-size:10.0pt;


        mso-ligatures:none;}


@page WordSection1


        {size:8.5in 11.0in;


        margin:1.0in 1.0in 1.0in 1.0in;}


div.WordSection1


        {page:WordSection1;}


--></style><!--[if gte mso 9]><xml>


<o:shapedefaults v:ext="edit" spidmax="1026" />


</xml><![endif]--><!--[if gte mso 9]><xml>


<o:shapelayout v:ext="edit">


<o:idmap v:ext="edit" data="1" />


</o:shapelayout></xml><![endif]-->


</head>


<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">


<div class="WordSection1">


<p class="MsoNormal"><span style="font-size:11.0pt">CS community,</span><span style="font-size:11.0pt"><o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt">The final iCORE workshop of Spring 2026 is today at 4:00 pm in iCORE (NRC 2100) and online. See the attached calendar invitation.<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt">As always, iCORE updates begin at 3:30 pm and the workshop starts at 4:00 pm; you are welcome to attend either or both. Both are in NRC 2100, and you can also join virtually.<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p style="margin:0in"><span style="font-size:13.0pt;font-family:'Arial',sans-serif;color:black">Machine Learning Workshop: Imbalanced Data Strategies</span><o:p></o:p></p>


<p style="margin:0in"><span style="font-size:13.0pt;font-family:'Arial',sans-serif;color:black">Dr. Evan Krell</span><o:p></o:p></p>


<p style="margin:0in"><span style="font-size:13.0pt;font-family:'Arial',sans-serif;color:black">Naval Research Lab - Marine Meteorology Division</span><o:p></o:p></p>


<p class="MsoNormal"><o:p> </o:p></p>


<p style="margin:0in"><span style="font-size:13.0pt;font-family:'Arial',sans-serif;color:black">A major challenge in machine learning is dealing with imbalanced datasets, that is, datasets that do not have an equal number of samples from each class. In some situations, the dataset is highly imbalanced because certain events are genuinely rare, so class balance cannot be obtained through additional sampling. For example, when training a model to predict hurricanes, there are very few hurricane examples compared to a massive number of non-hurricane examples. This makes it difficult to model extreme events. To deal with this, several class balancing methods have been proposed and have become very popular: Random Undersampling, Random Oversampling, SMOTE, and weighted loss functions. While they can be effective at improving skill at predicting the minority class, they have some major disadvantages. Mainly, the output probability loses its statistical meaning. In this workshop, I show that these data balancing methods might not do what you think they are doing, and that there is a simpler alternative for tuning models under class imbalance. It is applied after training the model, so the original probabilities, which represent the training distribution, are maintained.</span><o:p></o:p></p>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>




<div>


<p class="MsoNormal"><span style="font-size:11.0pt;mso-ligatures:none">Scott King<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;mso-ligatures:none">Professor of Computer Science<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;mso-ligatures:none">Director, iCORE<o:p></o:p></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;mso-ligatures:none">Texas A&M University – Corpus Christi<o:p></o:p></span></p>


</div>


<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>


<p class="MsoNormal"><o:p> </o:p></p>


</div>


</body>


</html>