<node id="667042">
  <nid>667042</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1680550343</created>
  <changed>1680550343</changed>
  <title><![CDATA[PhD Defense by Simiao Zuo]]></title>
  <body><![CDATA[<p><span><span><span><span><span><span>You are cordially invited to my thesis defense on April 10th.</span></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><strong><span><span><span>Title: </span></span></span></strong><span><span><span>On Training, Inference, and Sample Efficiencies of Language Models</span></span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><strong><span><span>Date: </span></span></strong><span><span>04/10/2023<strong>&nbsp;</strong></span></span></span></span></span></p>

<p><span><span><span><strong><span><span>Time: </span></span></strong><span><span>12:00 pm - 1:00 pm EDT<strong>&nbsp;</strong></span></span></span></span></span></p>

<p><span><span><span><strong><span><span>Location:</span></span></strong><span><span>&nbsp;Zoom</span></span></span></span></span></p>

<p><span><span><span><span><span>Meeting URL: <a href="https://gatech.zoom.us/j/99505946703?pwd=ZlgvYTZjQjJLbnplTUtxSHJvelJtdz09">https://gatech.zoom.us/j/99505946703?pwd=ZlgvYTZjQjJLbnplTUtxSHJvelJtdz09</a> </span></span></span></span></span></p>

<p><span><span><span><span><span>Meeting ID: 995 0594 6703</span></span></span></span></span></p>

<p><span><span><span><span><span>Passcode: 644944</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><strong><span><span>Simiao Zuo</span></span></strong></span></span></span></p>

<p><span><span><span><span><span>Machine Learning PhD Student</span></span></span></span></span></p>

<p><span><span><span><span><span>School of Industrial and Systems Engineering</span></span><br />
<span><span>Georgia Institute of Technology</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><strong><span><span>Committee</span></span></strong></span></span></span></p>

<p><span><span><span><span><span>1. Dr. Tuo Zhao (Advisor, ISyE, Georgia Tech)</span></span></span></span></span></p>

<p><span><span><span><span><span>2. Dr. Chao Zhang (CSE, Georgia Tech)</span></span></span></span></span></p>

<p><span><span><span><span><span>3. Dr. Yajun Mei (ISyE, Georgia Tech)</span></span></span></span></span></p>

<p><span><span><span><span><span>4. Dr. Anqi Wu (CSE, Georgia Tech)</span></span></span></span></span></p>

<p><span><span><span><span><span>5. Dr. Xiaodong Liu (Microsoft Research, Microsoft)</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><strong><span><span>Abstract</span></span></strong></span></span></span></p>

<p><span><span><span><span><span>Large language models have demonstrated superior performance in various natural language processing tasks such as machine translation, natural language understanding, and natural language generation. However, despite recent developments, language models still face critical challenges. In this thesis, we investigate efficient training and inference algorithms, as well as the sample efficiency of training language models.</span></span></span></span></span></p>

<p>&nbsp;</p>

<p><span><span><span><span><span>In Chapter 2, we improve the training efficiency of sparsely activated models by proposing a novel Mixture-of-Experts architecture. In Chapter 3, we propose state space augmented Transformer models, which facilitate efficient modeling of long sequences. In Chapter 4, we target the inference efficiency of pre-trained language models. Specifically, we propose a knowledge distillation algorithm that adapts a pre-trained model into a Mixture-of-Experts model. In Chapter 5, we design a label-efficient self-training algorithm. Specifically, we integrate differentiable teacher models into the conventional teacher-student self-training framework.</span></span></span></span></span></p>

<p>&nbsp;</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[On Training, Inference, and Sample Efficiencies of Language Models]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p><span><span><span><span>On Training, Inference, and Sample Efficiencies of Language Models</span></span></span></span></p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-04-10T12:00:39-04:00]]></value>
      <value2><![CDATA[2023-04-10T13:00:39-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[ZOOM]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[https://gatech.zoom.us/j/99505946703?pwd=ZlgvYTZjQjJLbnplTUtxSHJvelJtdz09]]></url>
      <title><![CDATA[ZOOM]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
