<node id="667274">
  <nid>667274</nid>
  <type>event</type>
  <uid>
    <user id="27707"><![CDATA[27707]]></user>
  </uid>
  <created>1681312606</created>
  <changed>1681312606</changed>
  <title><![CDATA[PhD Defense by Sihan Zeng]]></title>
  <body><![CDATA[<p><strong>Title:</strong> Designing Policy Optimization Algorithms For Multi-Agent Reinforcement Learning</p>

<p>&nbsp;</p>

<p><strong>Date:</strong> Monday, April 24, 2023</p>

<p><strong>Time:</strong> 10:00am - 11:00am Eastern Time</p>

<p><strong>Teams link:</strong> <a href="https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzQ5NDE2ZWItMDIxZi00Yzk0LWJiODEtMmM1MzY4ZThmMTky%40thread.v2/0?context=%7b%22Tid%22%3a%22482198bb-ae7b-4b25-8b7a-6d7f32faa083%22%2c%22Oid%22%3a%220e0acf27-edc3-43d4-8ef1-6945305e20e6%22%7d" target="_blank">https://teams.microsoft.com/l/meetup-join/19%3ameeting_MzQ5NDE2ZWItMDIxZi00Yzk0LWJiODEtMmM1MzY4ZThmMTky%40thread.v2/0?context=%7b%22Tid%22%3a%22482198bb-ae7b-4b25-8b7a-6d7f32faa083%22%2c%22Oid%22%3a%220e0acf27-edc3-43d4-8ef1-6945305e20e6%22%7d</a></p>

<p>&nbsp;</p>

<p><strong>Sihan Zeng</strong></p>

<p>Machine Learning PhD Student</p>

<p>School of Electrical and Computer Engineering</p>

<p>Georgia Institute of Technology</p>

<p>&nbsp;</p>

<p><strong>Committee</strong></p>

<p>1. Dr. Justin Romberg (Advisor)</p>

<p>2. Dr. Siva Theja Maguluri</p>

<p>3. Dr. Guanghui Lan</p>

<p>4. Dr. Thinh T. Doan</p>

<p>5. Dr. Daniel Molzahn</p>

<p>&nbsp;</p>

<p><strong>Abstract</strong></p>

<p>The overall objective of the thesis is to enhance the understanding of structure in multi-agent reinforcement learning (RL) and to build reliable and efficient algorithms that exploit and/or respect that structure. First, we present a unified two-time-scale stochastic optimization framework under a special type of gradient oracle that abstracts a range of data-driven algorithms in RL. Targeting single-agent RL problems, this framework builds the mathematical foundation for designing and analyzing data-driven multi-agent RL algorithms. Second, we discuss the challenges and structure of multi-agent RL in multi-task cooperative and two-player competitive settings, and leverage that structure to design provably convergent and efficient algorithms. Finally, we apply multi-agent RL to solve power system optimization problems. Specifically, we develop an RL-based penalty parameter selection method for the alternating current optimal power flow (ACOPF) problem solved via ADMM, with the goal of minimizing the number of iterations until convergence. Our method significantly accelerates ADMM convergence compared to state-of-the-art hand-designed parameter selection schemes and exhibits superior generalizability.</p>

<p>&nbsp;</p>
]]></body>
  <field_summary_sentence>
    <item>
      <value><![CDATA[Designing Policy Optimization Algorithms For Multi-Agent Reinforcement Learning]]></value>
    </item>
  </field_summary_sentence>
  <field_summary>
    <item>
      <value><![CDATA[<p>See below</p>
]]></value>
    </item>
  </field_summary>
  <field_time>
    <item>
      <value><![CDATA[2023-04-24T10:00:38-04:00]]></value>
      <value2><![CDATA[2023-04-24T23:00:38-04:00]]></value2>
      <rrule><![CDATA[]]></rrule>
      <timezone><![CDATA[America/New_York]]></timezone>
    </item>
  </field_time>
  <field_fee>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_fee>
  <field_extras>
      </field_extras>
  <field_audience>
          <item>
        <value><![CDATA[Faculty/Staff]]></value>
      </item>
          <item>
        <value><![CDATA[Public]]></value>
      </item>
          <item>
        <value><![CDATA[Graduate students]]></value>
      </item>
      </field_audience>
  <field_media>
      </field_media>
  <field_contact>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_contact>
  <field_location>
    <item>
      <value><![CDATA[TEAMS]]></value>
    </item>
  </field_location>
  <field_sidebar>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_sidebar>
  <field_phone>
    <item>
      <value><![CDATA[]]></value>
    </item>
  </field_phone>
  <field_url>
    <item>
      <url><![CDATA[]]></url>
      <title><![CDATA[]]></title>
            <attributes><![CDATA[]]></attributes>
    </item>
  </field_url>
  <field_email>
    <item>
      <email><![CDATA[]]></email>
    </item>
  </field_email>
  <field_boilerplate>
    <item>
      <nid><![CDATA[]]></nid>
    </item>
  </field_boilerplate>
  <links_related>
      </links_related>
  <files>
      </files>
  <og_groups>
          <item>221981</item>
      </og_groups>
  <og_groups_both>
          <item><![CDATA[Graduate Studies]]></item>
      </og_groups_both>
  <field_categories>
          <item>
        <tid>1788</tid>
        <value><![CDATA[Other/Miscellaneous]]></value>
      </item>
      </field_categories>
  <field_keywords>
          <item>
        <tid>100811</tid>
        <value><![CDATA[PhD Defense]]></value>
      </item>
      </field_keywords>
  <userdata><![CDATA[]]></userdata>
</node>
