Neural random fields with applications to modeling of languages and images

Zhijian Ou, Tsinghua University

ECEB 3032, 16:00-17:00, Monday, October 8, 2018

One of the core research problems in artificial intelligence is learning with probabilistic models, which can be broadly classified into two classes: directed and undirected graphical models. In directed models (also known as Bayesian networks), the joint distribution is factorized into a product of local conditional probability functions, while in undirected models (also known as random fields or energy-based models), the joint distribution is defined to be proportional to the product of local unnormalized potential functions. In contrast to the wide use of directed models, e.g. neural-network-based classifiers, variational autoencoders (VAEs), generative adversarial networks (GANs), and auto-regressive models (e.g. RNNs), undirected models have received less attention and have progressed more slowly.
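
As a rough illustration of the two factorizations described above, they can be written as follows. The notation is assumed here and is not taken from the talk: x = (x_1, ..., x_n) are the variables, pa(i) denotes the parents of node i in the directed graph, C is the set of cliques of the undirected graph, phi_c are the unnormalized potential functions, and Z is the normalizing constant.

```latex
\begin{align}
p_{\text{directed}}(x) &= \prod_{i=1}^{n} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr), \\
p_{\text{undirected}}(x) &= \frac{1}{Z} \prod_{c \in \mathcal{C}} \phi_c(x_c),
\qquad Z = \sum_{x'} \prod_{c \in \mathcal{C}} \phi_c(x'_c).
\end{align}
```

Each factor in the directed case is already a normalized conditional distribution, whereas the undirected case needs only the single global constant Z, which is generally expensive to compute.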

Compared with directed models, random fields offer several advantages: (1) they may express relationships between variables more compactly when the directionality of a relationship cannot be clearly defined; (2) they are computationally more efficient in inference because they avoid local normalization; (3) they can be naturally utilized for semi-supervised learning; (4) conditional random fields overcome the “label bias” problem in sequence transduction tasks. In this talk, we present our efforts in developing neural random fields (NRFs), which are defined by using neural networks, instead of the traditional linear functions, to define the potential function. NRFs marry random fields and neural networks and further increase the modeling capacity. Superior experimental performance of NRFs is shown in various tasks such as language modeling for speech recognition, image generation, image classification, and sequence tagging for natural language processing (NLP). We introduce a series of newly developed learning algorithms, which enable us to successfully train NRFs on large datasets in these tasks. As we show, NRFs provide a powerful tool for machine learning and merit further development and application.
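
To make the idea of a neural-network potential concrete, here is a minimal sketch of a toy random field over short binary vectors. Everything in it is an assumption made for illustration and not the speaker's method: the names `make_potential` and `exact_distribution` are hypothetical, the network is a tiny randomly initialized one-hidden-layer model that is never trained, and the normalizing constant is computed by brute-force enumeration, which is feasible only for very small problems.

```python
import itertools
import math
import random


def make_potential(dim, hidden, seed=0):
    """Return a log-potential function defined by a small random network.

    Hypothetical helper: the weights are random and untrained.
    """
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
    b1 = [rng.gauss(0, 1) for _ in range(hidden)]
    w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def log_potential(x):
        # One tanh hidden layer followed by a linear output (a scalar score).
        h = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)
        ]
        return sum(w * hi for w, hi in zip(w2, h))

    return log_potential


def exact_distribution(log_potential, dim):
    """Normalize exp(log_potential) over all 2**dim binary states."""
    states = list(itertools.product([0, 1], repeat=dim))
    scores = [log_potential(x) for x in states]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return {x: math.exp(s - log_z) for x, s in zip(states, scores)}


phi = make_potential(dim=4, hidden=8)
probs = exact_distribution(phi, dim=4)
```

The network outputs only an unnormalized score for a whole configuration, and a single global normalization turns those scores into a distribution. On realistic problems the enumeration in `exact_distribution` is intractable, which is why dedicated learning algorithms are needed.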


  1. Zhijian Ou. A Review of Learning with Deep Generative Models from Perspective of Graphical Modeling. arXiv:1808.01630.
  2. Yunfu Song, Zhijian Ou. Learning Neural Random Fields with Inclusive Auxiliary Generators. arXiv:1806.00271.
  3. Bin Wang, Zhijian Ou. Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation. IEEE Workshop on Spoken Language Technology (SLT), Athens, Greece, 2018.
  4. Bin Wang, Zhijian Ou. Learning neural trans-dimensional random field language models with noise-contrastive estimation. ICASSP, Calgary, Canada, 2018.
  5. Bin Wang, Zhijian Ou, Zhiqiang Tan. Learning Trans-dimensional Random Fields with Applications to Language Modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018, vol. 40, no. 4, pp. 876-890.
  6. Bin Wang, Zhijian Ou. Language modeling with neural trans-dimensional random fields. IEEE Automatic Speech Recognition and Understanding (ASRU) Workshop, Okinawa, Japan, 2017.
  7. Bin Wang, Zhijian Ou, Zhiqiang Tan. Trans-dimensional Random Fields for Language Modeling. Annual Meeting of the Association for Computational Linguistics (ACL Long Paper), Beijing, China, 2015.


Zhijian Ou received the B.S. degree with the highest honor in electronic engineering from Shanghai Jiao Tong University in 1998 and the Ph.D. degree in electronic engineering from Tsinghua University in 2003. Since 2003, he has been with the Department of Electronic Engineering at Tsinghua University, where he is currently an associate professor. From August 2014 to July 2015, he was a visiting scholar at the Beckman Institute, University of Illinois at Urbana-Champaign. He has led research projects funded by NSF China (NSFC), the China 863 High-tech Research and Development Program, and the China Ministry of Information Industry, as well as joint research projects with Intel, Panasonic, IBM, and Toshiba. He currently serves as a member of the China Computer Federation (CCF) Speech Conversation and Auditory Technical Committee and of the National Conference on Man-Machine Speech Communication (NCMMSC) Steering Committee. His recent research interests include speech processing (speech recognition and understanding, speaker recognition, natural language processing) and machine intelligence (particularly with graphical models).