<node id="670467">
  <nid>670467</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1697551704</created>
  <changed>1697551704</changed>
  <title><![CDATA[PhD Defense by Fan Bai]]></title>
  <body><![CDATA[<p><strong>Title</strong>: Information Extraction on Scientific Literature under Limited Supervision</p>

<p>&nbsp;</p>

<p><strong>Date</strong>: Monday, October 30, 2023</p>

<p><strong>Time</strong>: 2:30 PM – 4:30 PM ET</p>

<p><strong>Location</strong>: <a href="https://gatech.zoom.us/j/93262683372?pwd=NGd1ZUhlSHBzSDFUL3B5N29mQUdWdz09" title="https://gatech.zoom.us/j/93262683372?pwd=NGd1ZUhlSHBzSDFUL3B5N29mQUdWdz09">Zoom Link</a></p>

<p>&nbsp;</p>

<p><strong>Fan Bai</strong></p>

<p>Ph.D. Candidate in Computer Science</p>

<p>School of Interactive Computing</p>

<p>College of Computing</p>

<p>Georgia Institute of Technology</p>

<p>&nbsp;</p>

<p><strong>Committee</strong>:</p>

<p>Dr. Alan Ritter (Advisor) – School of Interactive Computing, Georgia Institute of Technology</p>

<p>Dr. Wei Xu – School of Interactive Computing, Georgia Institute of Technology</p>

<p>Dr. Zsolt Kira – School of Interactive Computing, Georgia Institute of Technology</p>

<p>Dr. Gabriel Stanovsky – School of Computer Science and Engineering, Hebrew University of Jerusalem</p>

<p>Dr. Hoifung Poon – Microsoft Health Futures</p>

<p>&nbsp;</p>

<p><strong>Abstract</strong>:</p>

<p>The exponential growth of scientific literature presents both challenges and opportunities for researchers across disciplines. Effectively extracting pertinent information from this extensive corpus is crucial for advancing knowledge, enhancing collaboration, and driving innovation. However, manual extraction is laborious and time-consuming, underscoring the demand for automated solutions. Information extraction (IE), a sub-field of natural language processing (NLP) focused on automatically extracting structured information from unstructured data sources, plays a central role in addressing this challenge. Despite their success, many IE methods require substantial human-annotated data, which may not be readily available, particularly in specialized scientific domains. This highlights the need for adaptable and robust techniques that can function with limited supervision.</p>

<p>&nbsp;</p>

<p>In this thesis, we study the task of information extraction on scientific literature, particularly addressing the challenge of limited (human) supervision. Specifically, our work examines three key dimensions of this problem. First, we explore the potential of harnessing easily accessible resources, such as knowledge bases, to develop information extraction systems without direct human supervision. Next, we investigate the balance between the labor cost of human annotation and the computational cost of domain-specific pre-training, to achieve optimal performance under budget constraints. Lastly, we capitalize on the emerging capabilities of large pre-trained language models by showing how information extraction can be achieved with minimal demonstrations or based solely on a human-crafted data schema. Through these explorations, this thesis aims to lay a solid foundation for the continued advancement of scientific information extraction under limited supervision.</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Information Extraction on Scientific Literature under Limited Supervision]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>Information Extraction on Scientific Literature under Limited Supervision</p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-10-30T14:30:00-04:00]]></value>
      <value2><![CDATA[2023-10-30T16:30:00-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[ZOOM]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[Phd Defense]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
