Sugiyama Lab. Invited Guest Talks (Since 2007/7)



Date & Time
2012/3/30 13:30-14:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Hal Daume III (University of Maryland, USA)
Title
Structured Prediction need not be Slow
Abstract
Classic algorithms for predicting structured data (e.g., graphs, trees, etc.) rely on expensive (sometimes intractable) inference at test time. In this talk, I'll discuss several recent approaches that enable computationally efficient (e.g., linear-time) prediction at test time. These approaches fall in the category of learning algorithms that optimize accuracy for some fixed notion of efficiency. I'll conclude by considering the question: can a learning algorithm figure out how to make fast predictions on its own?

Date & Time
2012/3/30 14:30-15:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Ruslan Salakhutdinov (University of Toronto, Canada)
Title
Learning Hierarchical Models
Abstract
Building intelligent systems that are capable of extracting meaningful representations from high-dimensional data lies at the core of solving many Artificial Intelligence tasks, including speech perception, visual object recognition, information retrieval, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep hierarchical structure that support inferences at multiple levels. In this talk, I will introduce a broad class of probabilistic generative models called Deep Boltzmann Machines (DBMs), and a new algorithm for learning these models that uses variational methods and Markov chain Monte Carlo. I will show that DBMs can learn useful hierarchical representations from large volumes of high-dimensional data, and that they can be successfully applied in many domains, including speech perception, information retrieval, object recognition, and nonlinear dimensionality reduction. I will then describe a new class of more complex probabilistic graphical models that combine Deep Boltzmann Machines with structured hierarchical Bayesian models, called Hierarchical-Deep (HD) Models. I will show how these models can learn a deep hierarchical structure for sharing knowledge across hundreds of visual categories, which allows accurate learning of novel visual concepts from few examples.

Date & Time
2012/3/5 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Minh Ha Quang (Italian Institute of Technology, Italy)
Title 1
Vector-valued reproducing kernel Hilbert spaces and applications
Abstract 1
Kernel methods have recently emerged as a powerful framework for many machine learning and data mining applications. Most of the literature on kernel methods so far has focussed on scalar-valued kernels. In this talk, we will give an overview of the theory of operator-valued positive definite kernels and their associated vector-valued reproducing kernel Hilbert spaces (RKHS).
We will present two sets of applications. The first is for the problem of colorization of black and white images (joint work with Sung Ha Kang and Triet Le, Journal of Mathematical Imaging and Vision, 2010).
The second, which is joint work with Vikas Sindhwani (ICML 2011), is on vector-valued manifold regularization, with examples in multi-label image classification and hierarchical text categorization.
Title 2
Slow feature analysis and decorrelation filtering for separating correlated sources
Abstract 2
Slow Feature Analysis (SFA) is a method for extracting slowly varying features from input signals. In this talk, we generalize SFA to vector-valued functions of several variables and apply it to the problem of blind source separation, in particular image separation. When the sources are correlated, we apply the following technique, called decorrelation filtering: use a linear filter to decorrelate the sources and their derivatives, then apply the separating matrix obtained on the filtered sources to the original sources. We show that if the filtered sources are perfectly separated by this matrix, then so are the original sources. We show how to numerically obtain such a decorrelation filter by solving a nonlinear optimization problem. This technique can also be applied to other linear separation methods whose output signals are uncorrelated, such as ICA.
This is joint work with Laurenz Wiskott (ICCV 2011).

Date & Time
2012/1/10 15:00-16:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Milan Vojnovic (Microsoft Research Cambridge, UK)
Title
Continuous Distributed Counting for Non-Monotonic Streams
Abstract
We consider the continual count tracking problem in a distributed environment where the input is an aggregate data stream originating from k distinct sites and the updates are allowed to be non-monotonic, i.e., both increments and decrements are allowed. The goal is to continually track the count within a prescribed relative accuracy \epsilon at the lowest possible communication cost. Specifically, we consider an adversarial setting where the input values are selected and assigned to sites by an adversary but the order is according to a random permutation or is a random i.i.d. process. The input stream of values is allowed to be non-monotonic with an unknown drift -1 \leq \mu \leq 1, where the case \mu = 1 corresponds to the special case of a monotonic stream of only non-negative updates. We show that a randomized algorithm is guaranteed to track the count accurately with high probability and has expected communication cost \tilde{O}(\min\{\sqrt{k}/(|\mu| \epsilon), \sqrt{k n}/\epsilon, n\}) for an input stream of length n, and we establish matching lower bounds. Last but not least, we also provide an algorithm and a communication complexity upper bound for a fractional Brownian motion input, and show how our non-monotonic counter can be applied to track the second frequency moment and to a Bayesian linear regression problem.

Joint work with Zhenming Liu and Bozidar Radunovic.
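
To get a feel for which term of the communication bound in the abstract dominates, the expression inside the \tilde{O}(...) can be evaluated directly. The following is only an illustrative sketch: it ignores the constants and logarithmic factors hidden by \tilde{O}, the function name `expected_comm_bound` is hypothetical, and dropping the first term when \mu = 0 is an assumption made here to avoid division by zero.

```python
import math


def expected_comm_bound(k, n, eps, mu):
    """Evaluate min{sqrt(k)/(|mu| eps), sqrt(k n)/eps, n}.

    Constants and log factors hidden by the tilde-O are ignored,
    so the value is only meaningful for comparing regimes.
    """
    terms = [math.sqrt(k * n) / eps, float(n)]
    if mu != 0:
        # Assumption: the drift term is treated as unbounded when mu == 0.
        terms.append(math.sqrt(k) / (abs(mu) * eps))
    return min(terms)


# Strong drift: the sqrt(k)/(|mu| eps) term is the smallest.
print(expected_comm_bound(k=4, n=100, eps=0.1, mu=1.0))
# No drift: the bound falls back to min{sqrt(k n)/eps, n}.
print(expected_comm_bound(k=4, n=100, eps=0.1, mu=0.0))
```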

Date & Time
2011/11/4 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Marco Cuturi (Kyoto University, Japan)
Title
Ground Metric Learning
Abstract
Transportation distances have been used for more than a decade now in machine learning to compare histograms of features. They have one parameter: the ground metric, which can be any metric between the features themselves. As is the case for all parameterized distances, transportation distances can only prove useful in practice when this parameter is carefully chosen. To date, the only option available to practitioners to set the ground metric parameter was to rely on a priori knowledge of the features, which limited considerably the scope of application of transportation distances. We propose to lift this limitation and consider instead algorithms that can learn the ground metric using only a training set of labeled histograms. We call this approach ground metric learning. We formulate the problem of learning the ground metric as the minimization of the difference of two polyhedral convex functions over a convex set of distance matrices. We follow the presentation of our algorithms with promising experimental results on binary classification tasks using GIST descriptors of images taken in the Caltech-256 set.
Preprint

Date & Time
2011/11/2 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Moritz Grosse-Wentrup (Max-Planck Institute, Germany)
Title
What are the Neurophysiological Causes of Performance Variations in Brain-Computer Interfacing?
Abstract
When a subject operates a non-invasive brain-computer interface (BCI), the system correctly infers the subject's intention in some trials, yet fails to make the right decision in other trials. As the algorithm used to decode brain signals is typically fixed, the reason for this variation in performance has to be found in the subject's brain states. In this talk, I argue that distributed gamma-range oscillations play a major role in determining BCI-performance. In particular, I present empirical evidence that gamma-range oscillations modulate the sensorimotor-rhythm [1], and may be used to predict BCI-performance on a trial-to-trial basis [2]. I further present preliminary evidence that feedback of fronto-parietal gamma-range oscillations may be used to induce a state-of-mind beneficial for operating a BCI [3].

References:
1. Grosse-Wentrup, M., B. Scholkopf and J. Hill. Causal Influence of Gamma Oscillations on the Sensorimotor Rhythm. NeuroImage 56(2), pp. 837-842, 2011.
2. Grosse-Wentrup, M. Fronto-Parietal Gamma-Oscillations are a Cause of Performance Variation in Brain-Computer Interfacing. Proceedings of the 5th International IEEE EMBS Conference on Neural Engineering (NER 2011), pp. 384-387, 2011.
3. Grosse-Wentrup, M. Neuro-Feedback of Fronto-Parietal Gamma-Oscillations. 5th International BCI Conference, Graz, Austria, 2011.

Date & Time
2011/10/28 13:20-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Jun-ichiro Hirayama (Kyoto University, Japan)
Title
Bregman Divergence as General Framework to Estimate Unnormalized Statistical Models
Abstract
A parametric statistical model often has an intractable normalization factor, which makes standard maximum likelihood estimation impractical. Recently, several alternative methods have been proposed to deal with this difficulty in the estimation of "unnormalized" statistical models, where "unnormalized" means that the model has an intractable normalization factor, or even has no normalization factor. A classical example is Pseudolikelihood, proposed by Besag for discrete MRFs; other recent examples include Contrastive Divergence, Score Matching, Ratio Matching, and Noise-Contrastive Estimation and its generalization.

We have recently shown that minimization of Bregman divergence (BD) provides a rich framework to estimate unnormalized statistical models, which unifies and generalizes some of the existing principles. This talk covers selected results from this study, along with a few new ones. I will first introduce the problem of estimating unnormalized models, and then show how Noise-Contrastive Estimation and its generalization can be interpreted as BD minimization. I will also point out its connection to a framework of "density ratio estimation" using BD. I will finally show that the proposed framework also contains Score Matching, Ratio Matching and Pseudolikelihood as special cases.

Date & Time
2011/8/5 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Oliver Kroemer (Max-Planck Institute, Germany)
Title
Learning Dynamic Tactile Sensing with Robust Vision-based Training
Abstract
Dynamic tactile sensing is a fundamental ability for recognizing materials and objects. However, while humans are born with partially developed dynamic tactile sensing and master this skill quickly, today’s robots remain in their infancy. The development of such a sense requires not only better sensors, but also the right algorithms to deal with these sensors’ data. For example, when classifying a material based on touch, the data is noisy, high-dimensional and contains irrelevant signals as well as essential ones. Few classification methods from machine learning can deal with such problems. In this talk, I will discuss an efficient approach to inferring suitable lower-dimensional representations of the tactile data. In order to classify materials based on only the sense of touch, these representations are autonomously discovered using visual information of the surfaces during training. However, accurately pairing vision and tactile samples in real robot applications is a difficult problem. The proposed approach therefore works with weak pairings between the modalities. Experiments show that the resulting approach is very robust and yields significantly higher classification performance based on only dynamic tactile sensing.

Date & Time
2011/6/16 10:45-12:15
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Pritee Khanna (Indian Institute of Information Technology, India)
Title
Content-based image retrieval
Abstract
In recent years, with the growing need for multimedia information retrieval on the Internet, research on and application of image retrieval technology have followed this trend and show great promise. However, achieving fast and efficient image retrieval on the Internet still faces many technical problems, which restrict its extensive and in-depth application and have made it a focus of research. In particular, it is relatively difficult to balance system response time against retrieval accuracy in distributed networks that store massive amounts of unstructured or semi-structured data.

Since the early 1990s, content-based image retrieval has taken low-level visual features (color, texture, shape, object, etc.) as the research priority of image retrieval. Some of its important characteristics are intuitiveness (example description), efficiency (similarity matching), and universality (query without the help of domain knowledge). These characteristics help overcome some defects of keyword-based image retrieval, such as subjectivity (unintuitive retrieval results), ambiguity (inaccurate description of image content in natural language) and inconvenience (a large amount of manual image annotation), and the field has developed vigorously. At the same time, content-based image retrieval has inevitably shown some shortcomings. To guarantee retrieval accuracy, the extracted image features have a high dimensionality, which rises drastically as retrieval accuracy improves. This greatly increases the burden of indexing and decreases the efficiency of retrieval.

I will discuss the issues that need attention in developing an effective CBIR system.

Date & Time
2011/5/31 13:30-15:00
Venue
Seminar Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Nigel Collier (National Institute of Informatics, Japan)
Title
Web Sensing for Real Time Disaster Detection and Tracking
Abstract
Accurate and timely detection of public health disasters such as the spread of infectious diseases and chemical contamination is necessary to help support risk assessment and ultimately to save lives and livelihoods. In this talk I will present progress on the JST-funded BioCaster project. BioCaster exploits high-throughput biomedical text mining from global news media to detect norm violations in near real time. Additionally, I will discuss our recent investigation into tracking syndromic trends from user-generated content in the DIZIE project and show how social media can complement news events in both spatial and temporal resolution. Early results for DIZIE illustrate how selected features are highly correlated with laboratory data for influenza. Ongoing challenges will also be discussed, including: (1) bridging the gap between laypeople's and experts' terminology, (2) integrating evidence across documents and information spaces, and (3) providing realistic benchmarks.

Date & Time
2011/5/13 15:30-17:00
Venue
Seminar Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Masanori Kawakita (Kyushu University, Japan)
Title
A Class of Semi-Supervised Learning in View of Statistical Paradox and Its Model Selection
Abstract
We analyze the performance of a certain class of semi-supervised regression and propose a model selection method for it. In semi-supervised learning, it is often assumed that only a few labeled data are available. For model selection, however, almost all conventional semi-supervised methods use AIC or cross-validation based on these few labeled data. This leads to a large variance in risk estimation.

First, we focus on a certain class of semi-supervised regression, which is based on the weighted likelihood with the ratio between labeled data density p(x) and unlabeled data density p'(x). We refer to this approach as Density-Ratio-Estimation-based Semi-Supervised (DRESS) regression in this talk. This approach has been studied well in the situation where p'(x) differs from p(x). If p'(x)=p(x), the DRESS approach seems meaningless because the target density ratio p'(x)/p(x) is trivially one at any x, leading to the usual least-squares estimator (LSE). Indeed, to our knowledge, almost no theory has so far guaranteed the performance of DRESS when p'=p. However, we can prove that DRESS improves on the risk of LSE under some conditions. This issue is structurally analogous to the statistical paradox "Even if we know the true value of a nuisance parameter, estimating it improves the accuracy in some situations". This analogy plays a central role in the above proof.

Second, we propose a new risk estimator for DRESS regression, which is referred to as Criterion-based-on-Risk-Of-Semi-Supervised regression (CROSS). Its derivation requires neither a large-sample assumption nor prior knowledge of the noise variance and distribution. DRESS+CROSS performs better than LSE under model misspecification, while it performs equally to or slightly worse than LSE when the model is correctly specified. Thus, it is necessary to estimate whether the model is correct or not.

Third, we address this issue to some extent. Simulations illustrate the performance of these proposals.
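
The abstract does not spell out the DRESS estimator, so the following is only a generic sketch of the idea it builds on: a least-squares fit in which each labeled sample is weighted by a density-ratio value. The function name `weighted_line_fit`, the restriction to a one-dimensional linear model, and the example weights are all assumptions for illustration; in practice the weights would come from an estimate of p'(x)/p(x). With all weights equal to one, the fit reduces to the usual LSE, as the abstract notes.

```python
def weighted_line_fit(xs, ys, weights):
    """Fit y = intercept + slope * x by weighted least squares.

    `weights` stand in for density-ratio values p'(x)/p(x) at the
    labeled inputs (hypothetical here; they would be estimated).
    """
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(weights, ys)) / sw  # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 3.0]
# Unit weights (the p' = p case with a known ratio): ordinary least squares.
print(weighted_line_fit(xs, ys, [1.0, 1.0, 1.0]))
# Non-uniform weights shift the fit toward the up-weighted region.
print(weighted_line_fit(xs, ys, [0.5, 1.0, 2.0]))
```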

Date & Time
2011/3/17 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
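Masanao Ochi (大知 正直) (The University of Electro-Communications, Japan)
Title
Proposal of Features for Ranking Functions Using Review Text
Abstract
In ranking functions that predict a user's ratings, it is known that using the ratings of other users as features makes the meaningful values sparse. In this study, we propose new features that make use of the review text written alongside each user's rating, and show that the sparsity of meaningful values can be alleviated. In addition, evaluation experiments based on real review data show that the prediction accuracy of user ratings improves compared with conventional methods.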

Date & Time
2011/3/9 10:00-11:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Jen-Tzung Chien (National Cheng Kung University, Taiwan)
Title
Bayesian and Sparse Learning of Acoustic and Language Models
Abstract
In this talk, I will present my recent studies on machine learning and speech recognition. Speech recognition involves extensive knowledge of machine learning and statistical modeling. Both acoustic modeling and language modeling are important parts of modern speech recognition algorithms. In acoustic modeling, I will introduce a sparse representation of acoustic features based on a set of state-dependent basis vectors. The Bayesian sensing hidden Markov models can be established from the heterogeneous training data. The hybrid dictionary learning and sparse representation is performed. In language modeling, I will address the topic model and present a Dirichlet class language model, which projects the sequence of history words onto a latent class space and calculates a marginal likelihood over the uncertainties of classes, which are expressed by Dirichlet priors. A Bayesian class-based language model is established and a variational Bayesian inference procedure is presented. In this presentation, I will report different evaluations on large vocabulary continuous speech recognition and briefly address some other works we are doing now on different topics of machine learning.

Short Bio: Jen-Tzung Chien received his Ph.D. degree in electrical engineering from the National Tsing Hua University, Hsinchu, Taiwan, in 1997. Since 1997, he has been with the Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, where he is currently a Professor. He has held visiting professor positions at Panasonic Technologies Inc., Santa Barbara, CA, the Tokyo Institute of Technology, Tokyo, Japan, the Georgia Institute of Technology, Atlanta, GA, Microsoft Research Asia, Beijing, China, and the IBM T. J. Watson Research Center, Yorktown Heights, NY. His research interests include machine learning, speech recognition, face recognition, information retrieval and signal separation.

Date & Time
2011/1/18 10:00-11:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Yee Whye Teh (University College London, UK)
Title
Hierarchical Bayesian Models of Language and Text
Abstract
In this talk I will present a new approach to modelling sequence data called the sequence memoizer. As opposed to most other sequence models, our model does not make any Markovian assumptions. Instead, we use a hierarchical Bayesian approach which enforces sharing of statistical strength across the different parts of the model. To make computations with the model efficient, and to better model the power-law statistics often observed in sequence data, we use a Bayesian nonparametric prior called the Pitman-Yor process as building blocks in the hierarchical model. We show state-of-the-art results on language modelling and text compression.

This is joint work with Frank Wood, Jan Gasthaus, Cedric Archambeau and Lancelot James.
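
The Pitman-Yor process mentioned in the abstract has a standard "Chinese restaurant" representation, which also makes its power-law behavior easy to observe. The sketch below samples table sizes from a single Pitman-Yor Chinese restaurant process; it is not the sequence memoizer itself, which arranges such processes hierarchically. The function name `pitman_yor_crp` and the parameter names `discount` and `concentration` are chosen here for illustration.

```python
import random


def pitman_yor_crp(n, discount, concentration, seed=0):
    """Seat n customers by the Pitman-Yor Chinese restaurant process.

    With i customers at t tables, the next customer joins table k with
    probability (count_k - discount) / (i + concentration) and opens a
    new table with probability
    (concentration + discount * t) / (i + concentration).
    Returns the list of table sizes.
    """
    if not (0 <= discount < 1) or concentration <= 0:
        raise ValueError("need 0 <= discount < 1 and concentration > 0")
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        t = len(counts)
        # Unnormalized weights: existing tables first, then a new table.
        weights = [c - discount for c in counts]
        weights.append(concentration + discount * t)
        k = rng.choices(range(t + 1), weights=weights)[0]
        if k == t:
            counts.append(1)
        else:
            counts[k] += 1
    return counts


sizes = pitman_yor_crp(1000, discount=0.5, concentration=1.0)
# A larger discount typically yields many small tables and a few large ones.
print(len(sizes), sorted(sizes, reverse=True)[:5])
```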

Date & Time
2010/11/15 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Fernando Villavicencio (Yamaha Corporation, Japan)
Title
Application of Voice-Conversion to Singing-Voice
Abstract
In this talk we will present the main features of our work concerning the application of Voice Conversion to Singing-Voice in order to achieve singer-timbre conversion on Yamaha's VOCALOID singing-synthesizer. Our main goal is to find a transformation of singing-voice samples of a source singer so that they are perceived with the timbre of a desired target singer. The timbre-conversion framework is based on a probabilistic conversion function derived after Gaussian Mixture Modeling of spectral envelope features. We will describe the main parts of this work as well as the results of the study of several issues, such as spectral envelope modeling, statistical modeling of the features, and the derivation of the timbre mapping from unpaired source-target data.

Date & Time
2010/11/11 10:30-12:00
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Thomas G. Dietterich (Oregon State University, USA)
Title
Fine-Grained Visual Categorization and the Problem of Novel Objects
Abstract
Fine-grained visual categorization is the problem of discriminating among very similar objects (e.g., species of animals, makes of automobiles). For the past seven years, we have been developing methods for fine-grained categorization of aquatic macro invertebrates (insect larvae that live in freshwater streams). This talk will discuss the computer vision and machine learning methods that we have developed and that show performance exceeding 88% correct on 29 species of aquatic macro invertebrates. An important challenge in this application is that insects belonging to species outside the training set can arise frequently, so the vision system must detect that these do not belong to any of the classes known to the system. We will discuss various methods that we have applied to this problem, and speculate on how we can improve the performance of these methods.

Date & Time
2010/10/26 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Yasuo Tabei (Japan Science and Technology Agency, Japan)
Title
SketchSort: Fast All Pairs Similarity Search Method by Multiple Sorting
Abstract
It has recently become increasingly common to represent images and signals as vectorial data. To save memory and improve speed, vectorial data are often represented as binary strings called sketches. Charikar (2002) proposed a fast approximate method for finding neighbor pairs of sketches by sorting and scanning with a small window. This method, which we shall call “single sorting”, is applied to locality-sensitive codes and is widely used in speed-demanding web-related applications. In this presentation, we present the multiple sorting method, which combines blockwise masking and radix sort. Additionally, the average false negative rate is computable and duplicate discoveries are deliberately avoided. Empirical experiments on a large-scale image dataset show that it is much faster than cover tree and Lanczos bisection.
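
The "single sorting" baseline described above can be sketched in a few lines. This is a minimal illustration, not the SketchSort multiple sorting method: sketches are assumed to be stored as Python integers, and the names `single_sort_pairs`, `window` and `max_dist` are hypothetical. Because only sketches that end up close together in sorted order are compared, pairs that differ in a high-order bit can be missed, which is why the method is approximate.

```python
def hamming(a, b):
    """Hamming distance between two sketches stored as integers."""
    return bin(a ^ b).count("1")


def single_sort_pairs(sketches, window, max_dist):
    """Sort the sketches, then compare each one only with the next
    `window` sketches in sorted order, keeping pairs whose Hamming
    distance is at most `max_dist`. Returns a set of index pairs."""
    order = sorted(range(len(sketches)), key=lambda i: sketches[i])
    pairs = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + 1 + window]:
            if hamming(sketches[i], sketches[j]) <= max_dist:
                pairs.add((min(i, j), max(i, j)))
    return pairs


sketches = [0b1010, 0b1011, 0b0101, 0b0100]
print(single_sort_pairs(sketches, window=1, max_dist=1))
# finds the index pairs (0, 1) and (2, 3)
```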

Date & Time
2010/10/7 16:30-18:00
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
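Tsuyoshi Ide (井手剛) (IBM Research - Tokyo, Japan)
Title
On the Trajectory Regression Problem on Networks
Abstract
Knowledge discovery from the trajectories of moving objects, such as typhoon tracks, the movement of people inside a store, or the movement of cars on a map, is one of the interesting topics in recent data mining. We recently formulated "trajectory regression", i.e., the problem of predicting the cost of a trajectory, within the framework of kernel regression (T. Ide and S. Kato, SDM 2009). In this talk, we show that following a different formulation yields a cost prediction method that is more useful in practice. At the same time, we show that it leads to a new understanding of kernel functions for trajectories. As a concrete application, we take up traffic flow analysis on maps and discuss what research challenges exist.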

Date & Time
2010/7/26 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Ivor Wai-Hung Tsang (Nanyang Technological University, Singapore)
Title
Non-parametric Kernel Learning: Algorithms and Applications
Abstract
Previous studies of Non-Parametric Kernel Learning (NPKL) usually formulate the learning task as a Semi-Definite Programming (SDP) problem that is often solved by a general-purpose SDP solver. However, for N data examples, the time complexity of NPKL using a standard interior-point SDP solver could be as high as O(N^6.5), which makes NPKL methods inapplicable to real applications, even for datasets of moderate size. In this paper, we present a family of efficient NPKL algorithms, termed “SimpleNPKL”, which can learn non-parametric kernels from a large set of pairwise constraints efficiently. In particular, we propose two efficient SimpleNPKL algorithms. One is the SimpleNPKL algorithm with linear loss, which enjoys a closed-form solution that can be efficiently computed by the Lanczos sparse eigendecomposition technique. The other is the SimpleNPKL algorithm with other loss functions (including square hinge loss, hinge loss, and square loss), which can be reformulated as a saddle-point optimization problem and then solved by a fast iterative algorithm. In contrast to previous NPKL approaches, our empirical results show that the proposed technique maintains the same accuracy while being significantly more efficient and scalable. Finally, we demonstrate that the proposed technique can also speed up many kernel learning tasks, including colored maximum variance unfolding, minimum volume embedding, and structure preserving embedding.

Besides SimpleNPKL, we also propose a novel non-parametric spectral kernel learning method which can seamlessly combine the manifold structure of unlabeled data and Regularized Least-Squares (RLS) to learn a new kernel. Interestingly, the new kernel matrix can be obtained analytically using the spectral decomposition of the graph Laplacian matrix. Hence, the proposed algorithm does not require any numerical optimization solvers. Moreover, by maximizing kernel target alignment on labeled data, we can also learn model parameters automatically with a closed-form solution. For a given graph Laplacian matrix, our proposed method does not need to tune any model parameter, including the tradeoff parameter in RLS and the balance parameter for unlabeled data. Extensive experiments on ten benchmark datasets show that our proposed non-parametric and parameter-free spectral kernel learning algorithm can obtain performance comparable to fine-tuned manifold regularization methods in the transductive setting, and outperform multiple kernel learning in the supervised setting.


Date & Time
2010/6/11 13:30-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Yukikazu Hidaka (University of Southern California, USA, and ATR Computational Neuroscience Laboratories, Japan)
Title
Use it and improve it, or lose it: Non-linear interactions between arm and hand use and function during stroke recovery
Abstract
In this talk, we introduce our research about neuro-computational rehabilitation for patients with stroke. Stroke-affected arm use in daily life presumably forms a part of effective rehabilitation therapy. However, there is little understanding of the interactions between arm use and function in humans post-stroke. In a previous computational study (Han, Arbib, and Schweighofer, 2008), we suggested that the dependence of function on use is non-linear after therapy: above a threshold of function, use will spontaneously improve, and in turn, function further improves; below this threshold, use and function of the affected limb will plateau or deteriorate, and compensatory strategies will develop further.

Here, we directly test this hypothesis by developing a 1st order dynamical model with non-linear interactions between function and use, and by analyzing how this model can account for actual stroke recovery data. Using a Bayesian framework, we systematically compared this model to other time-varying models with and without interactions between function and use. To train the parameters of all the models, we used data from the immediate treatment group of the EXCITE clinical trial (Wolf et al. 2006), in which use and function data were collected following two weeks of therapy at four-month intervals for two years.

Comparison of the model evidence probabilities showed that the best fitting model was our 1st order dynamical model with the non-linear interaction between function and use. We also predicted the recovery process of each patient, and categorized patients into the virtuous or vicious group, by using a threshold surface of the long-term arm use estimate. Finally, we compared model parameters before and after therapy and found that the only parameter which increased is related to the motivation to use the affected arm. Our results suggest that after rehabilitation, the interaction between function and use is a crucial factor for functional recovery.


Date & Time
2010/5/31 13:30-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Sungyoung Kim (Yamaha Corporation, Japan)
Title
Beyond surround: towards enhanced immersive presence of auditory information
Abstract
This presentation covers the psychoacoustical principles of the conventional multichannel audio system, proposes how such principles should apply to future 3-dimensional audio, and introduces the influence of non-auditory cues on an enhanced immersive feeling of "being there." As a case study, this talk introduces a signal processing method newly developed by Yamaha, which creates virtually elevated auditory imagery via a conventional 5.1 channel reproduction system. As an interim step between current surround audio and future periphonic audio, the proposed method allows listeners to experience a vertically extended space in which musicians and composers can better express themselves musically.

Date & Time
2010/3/1 10:30-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Katja Hansen (Technical University of Berlin, Germany)
Title
Machine Learning in Drug Discovery and Design
Abstract
Within the past two decades, Machine Learning methods have been established in a variety of applications in the field of computational chemistry. Due to the complex nature of drug design, these methods serve as perfect tools to decrease development time, cost and use of chemical resources.

Starting from a general overview of drug discovery, the talk will focus on different problems related to the specific requirements of the algorithms arising in this field of research. In particular, the question of interpretability is of great importance for a drug-designing scientist: complex Machine Learning approaches in general result in black box models. While delivering excellent prediction performance, most of these methods provide no answer as to why the model predicts a particular label (e.g. toxic/non-toxic) for a certain molecule. Given the immense impact on the subsequent drug development steps and the associated costs, the certainty of a prediction is nearly as precious for the chemist as the prediction itself. Two different approaches to confidence estimation will be introduced and evaluated on Ames mutagenicity data. Both focus on kernel-based Machine Learning algorithms, in particular Gaussian processes. Finally, additional approaches to enhance machine learning in drug discovery will be discussed.

Date & Time
2009/9/28 14:00-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Klaus-Robert Müller (Technical University of Berlin, Germany)
Title
Denoising and Dimension Reduction in Feature Space
Abstract
The talk presents recent work that interestingly complements our understanding of the VC picture in kernel-based learning. Our finding is that the relevant information of a supervised learning problem is contained up to negligible error in a finite number of leading kernel PCA components if the kernel matches the underlying learning problem. Thus, kernels not only transform data sets such that good generalization can be achieved using only linear discriminant functions, but this transformation is also performed in a manner which makes economic use of feature space dimensions. In the best case, kernels provide efficient implicit representations of the data for supervised learning problems. Practically, we propose an algorithm which enables us to recover the subspace and dimensionality relevant for good classification. Our algorithm can therefore be applied (1) to analyze the interplay of data set and kernel in a geometric fashion, (2) to aid in model selection, and (3) to denoise in feature space in order to yield better classification results.

We complement our theoretical findings by reporting on applications of our method to data from gene finding and brain computer interfacing.

This is joint work with Claudia Sanelli, Mikio Braun and Joachim M. Buhmann.

Date & Time
2009/7/17 13:20-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Toru Wakahara (Hosei University, Japan)
Title
Affine-Invariant Recognition of Face Images Using GAT Correlation
Abstract
My talk addresses the challenging problem of performing normalization and recognition of face images at the same time. The key idea is the use of Global Affine Transformation (GAT) correlation to determine the optimal 2D affine parameters that normalize a given image so as to yield the maximum correlation value with a target image. The GAT correlation method assigns an input face image to the face template having the largest GAT correlation value among all enrolled face templates. Experimental results using the public HOIP face image database demonstrate a very high recognition rate of 99.79%. Moreover, the proposed method successfully matches face templates with their artificially affine-transformed images subject to rotation within 45 degrees, scale change within 50 percent, and translation within 25 percent of the face extent.

Date & Time
2009/6/1 15:00-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Paul von Bünau (Technical University of Berlin, Germany)
Title
Stationary subspace analysis
Abstract
Non-stationarities are a ubiquitous phenomenon in statistical data analysis, yet they pose a challenge to standard Machine Learning methodology since the classic assumption of a stationary data generating process is violated. Conversely, understanding the nature of observed non-stationary behaviour often lies at the heart of a scientific question. To this end, we propose a novel unsupervised technique: Stationary Subspace Analysis (SSA). SSA decomposes a multivariate time series into its stationary and non-stationary components. In this context, we also investigate the occurrence of spurious stationarity and provide useful theoretical results on the circumstances under which spurious stationary components arise. We demonstrate the performance of our novel concept in extensive simulations and present a real-world application to Brain Computer Interfacing.

Date & Time
2009/3/19 16:00-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Yu Takahashi (Nara Institute of Science and Technology, Japan)
Title
Musical noise analysis for integration method of microphone array and nonlinear signal processing with higher-order statistics
Abstract
In recent years, methods that integrate microphone array signal processing with nonlinear signal processing have been studied to achieve better noise reduction. Such an integrated method can indeed achieve good noise reduction performance, but the nonlinear processing in the method causes an artificial distortion, so-called musical noise. Since musical noise makes users uncomfortable, it is desirable to mitigate it. Moreover, it has recently been reported that higher-order statistics are strongly related to the amount of generated musical noise. We therefore analyze the integrated method of microphone array signal processing and nonlinear signal processing on the basis of higher-order statistics. We also propose an architecture for reducing musical noise based on this analysis.

Date & Time
2008/9/11 13:30-
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Taiji Suzuki (University of Tokyo, Japan)
Title
A least-squares approach to mutual information estimation with application in variable selection
Abstract
We propose a new method of estimating mutual information from samples. Our method, called Least-Squares Mutual Information (LSMI), has several attractive properties, e.g., density estimation is not involved, an analytic-form solution is available, a variant of cross-validation can be used for model selection, and an approximate leave-one-out error can be computed very efficiently. Numerical experiments show that LSMI compares favorably with existing methods in mutual information estimation and variable selection. The practical usefulness of LSMI is demonstrated also in protein subcellular localization prediction.
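
The analytic-form solution mentioned in the abstract can be illustrated with a small sketch. The code below is not the speaker's implementation: it assumes scalar variables, Gaussian basis functions centered on the paired samples, a ridge regularizer, and fixed values for `sigma` and `lam` (the abstract says these would be chosen by cross-validation, which is omitted here). It estimates the squared-loss variant of mutual information by fitting the density ratio p(x, y) / (p(x) p(y)) with least squares, so no density estimation is involved.

```python
import math

def gaussian_gram(values, centers, sigma):
    """K[i][p] = exp(-(values[i] - centers[p])**2 / (2 * sigma**2))."""
    return [[math.exp(-(v - c) ** 2 / (2.0 * sigma ** 2)) for c in centers]
            for v in values]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def lsmi(xs, ys, sigma=0.1, lam=0.01):
    """Least-squares estimate of squared-loss mutual information (sketch)."""
    n = len(xs)
    kx = gaussian_gram(xs, xs, sigma)  # basis centers = the samples themselves
    ky = gaussian_gram(ys, ys, sigma)
    # h[p]: mean of basis function p over the paired (joint) samples.
    h = [sum(kx[i][p] * ky[i][p] for i in range(n)) / n for p in range(n)]
    # big_h[p][q]: mean of basis products over all n*n (x_i, y_j) combinations;
    # the product kernel lets the double sum factorize into two single sums.
    big_h = [[0.0] * n for _ in range(n)]
    for p in range(n):
        for q in range(n):
            sx = sum(kx[i][p] * kx[i][q] for i in range(n))
            sy = sum(ky[j][p] * ky[j][q] for j in range(n))
            big_h[p][q] = sx * sy / (n * n)
        big_h[p][p] += lam  # ridge regularization
    alpha = solve(big_h, h)  # closed-form solution of the least-squares fit
    return 0.5 * sum(hp * ap for hp, ap in zip(h, alpha)) - 0.5

# Toy data (hypothetical): y = x versus y equal to a fixed permutation of x.
n = 20
xs = [i / (n - 1) for i in range(n)]
dependent = lsmi(xs, xs)
scrambled = lsmi(xs, [xs[(7 * i) % n] for i in range(n)])
```

In this toy setup the strongly dependent pair is expected to receive the higher score; with so few samples the scrambled pair will not score exactly zero.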

Date & Time
2008/8/20 13:20-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Ron Begleiter (Technion Israel Institute of Technology, Israel)
Title
Repairing self-confident active-transductive learners using systematic exploration
Abstract
We consider an active learning game within a transductive learning model. A major problem with many active learning algorithms is that an unreliable current hypothesis can mislead the querying component to query "uninformative" points. In this work we propose a remedy to this problem. Our solution can be viewed as a "patch" for fixing this deficiency and also as a proposed modular approach for active transductive learning that produces powerful new algorithms. Extensive experiments on "real" data demonstrate the advantage of our method.

Reference:
R. Begleiter, R. El-Yaniv, and D. Pechyony, Repairing self-confident active-transductive learners using systematic exploration, Pattern Recognition Letters, 29(9), 1245--1251, 2008.

Date & Time
2008/5/19 10:30-12:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
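Muneki Yasuda (Tohoku University, Japan)
Title
Approximate Learning Rules for Boltzmann Machines Using Statistical Approximation Theory
Abstract
A Boltzmann machine is a mutually connected neural network whose internal structure contains signal-feedback loops. It is a stochastic neural network that can be regarded as an extension of the Hopfield model, one of the representative associative memory models, with stochastic state transitions added. Because of the rich structure of Boltzmann machines, applications to various optimization and pattern recognition problems are anticipated. However, learning requires computing the means and correlations of a Gibbs distribution, and carrying this out directly takes an enormous amount of computation time. Approximate learning rules based on various statistical approximation theories, beginning with mean-field theory, have therefore long been studied. In this talk, I introduce a new approximate learning rule for Boltzmann machines without hidden units, which combines belief propagation, recently widely used in many fields of information science, with an approximation technique called the linear response approximation. The linear response approximation is known to improve the accuracy of the correlations approximated by belief propagation, so higher approximation accuracy than that of conventional approximate learning rules can be expected. I also discuss strategies for approximate learning when hidden units are present.

References:
[1] M. Yasuda and T. Horiguchi: Triangular approximation for Ising model and its application to Boltzmann machine, Physica A, vol. 836, pp. 83-95, 2006.
[2] M. Yasuda and K. Tanaka: The Mathematical Structure of the Approximate Linear Response Relation, J. Phys. A: Math. and Theor., vol. 40, pp. 9993-10007, 2007.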

Date & Time
2008/4/3 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
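Jun-ya Gotoh (Chuo University, Japan)
Title
Portfolio Selection Based on Generalization Error Evaluation
Abstract
Portfolio selection is the problem of deciding how much of one's available funds to invest in which assets (i.e., the investment allocation). Traditional portfolio selection models determine the allocation so that the return distribution of the financial assets is optimal in the in-sample sense with respect to a criterion fixed in advance. Because the number of samples is usually limited, however, large estimation errors may arise in some cases. Moreover, since it is difficult to specify the return distribution of financial assets, support based on nonparametric assumptions is desirable. In this study, drawing on the similarity between the portfolio selection problem and the one-class nu-SVM, which is also known as a model for outlier detection, we evaluate (something like) a generalization error for portfolios, and present a new portfolio selection model based on it together with a solution method. Unlike traditional models, this model is novel in that it aims to improve out-of-sample performance. At the same time, it is closely related to the minimization of VaR and CVaR, which have long been used as criteria for portfolio selection, and thus also provides a theoretical justification for their performance. (This is joint work with Takeda of Tokyo Institute of Technology.)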

Date & Time
2008/3/19 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Justin Dauwels (Massachusetts Institute of Technology, USA)
Title
Machine Learning Techniques for Quantifying Neural Synchrony: Application to the Early Diagnosis of Alzheimer's Disease from EEG
Abstract
We present a novel approach to measure the interdependence of multiple time series, referred to as "stochastic event synchrony" (SES). As a first step, "events" are extracted from the given time series; next, those events are aligned. The better the alignment, the more similar the time series are considered to be. The similarity measure is computed by performing statistical inference on a sparse graph. As an application, we consider the problem of detecting anomalies in EEG synchrony of Mild Cognitive Impairment (MCI) patients. We present some results and discuss ideas for future research.

This talk is based on joint work with F. Vialatte (RIKEN, Japan), Theophane Weber (MIT), and A. Cichocki (RIKEN, Japan).

Date & Time
2008/1/31 15:00-16:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Kengo Kato (University of Tokyo, Japan)
Title
On the degrees of freedom in shrinkage estimator
Abstract
We study the degrees of freedom in shrinkage estimation of the regression coefficients. Generalizing the idea of the Lasso, we consider the problem of estimating the coefficients by the projection of the ordinary least squares estimator onto a closed convex set. Then an unbiased estimator of the degrees of freedom is derived in terms of geometric quantities under a smoothness condition on the boundary of the closed convex set. The result presented in this paper is applicable to estimation with a wide class of constraints. As an application, we obtain a Cp-type criterion and AIC for selecting the tuning parameter.

Reference: Technical Report
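
As a concrete instance of "projection of the ordinary least squares estimator onto a closed convex set", the sketch below projects a coefficient vector onto an L1 ball, the constraint set behind the Lasso. It is an illustration only, not the method of the talk: it assumes a plain Euclidean projection (which matches the constrained least-squares estimate only under an orthonormal design), uses the standard sort-and-threshold procedure, and does not compute the degrees-of-freedom estimator itself. The function name and the example values are hypothetical.

```python
import math

def project_onto_l1_ball(v, radius):
    """Euclidean projection of v onto {w : sum(|w_i|) <= radius}."""
    if sum(abs(x) for x in v) <= radius:
        return list(v)  # already inside the constraint set
    magnitudes = sorted((abs(x) for x in v), reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u in enumerate(magnitudes, start=1):
        cumulative += u
        candidate = (cumulative - radius) / j
        if u - candidate > 0:  # this coordinate would stay non-zero
            theta = candidate
    # Soft-threshold every coordinate by theta, keeping its sign.
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]

# Hypothetical OLS coefficients shrunk onto the ball of radius 2.
ols = [3.0, 1.0, 0.0]
shrunk = project_onto_l1_ball(ols, 2.0)
```

The projection sets small coefficients exactly to zero, which is why the geometry of the constraint boundary matters when counting the effective number of parameters.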
 
Date & Time
2008/1/8 15:00-16:30
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
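Kenichi Maeda (Toshiba, Japan)
Title
Image Recognition: The Forefront of Technology and Applications
Abstract
This talk introduces the forefront of image recognition. The technologies used (hardware, software, and algorithms) and the settings in which they are applied are presented with real examples such as face recognition and in-vehicle obstacle detection.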

Date & Time
2007/12/20 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
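Shinichi Nakajima (Nikon, Japan)
Title
Generalization Error Analysis of Singular Models Using the Limiting Eigenvalue Distribution of Wishart Matrices
Abstract
When each component of a random vector follows a normal distribution with mean 0, its sample covariance matrix follows a Wishart distribution. Assuming that the components of the original normal distribution are independent, if the dimension and the number of samples are increased while their ratio is kept fixed, the eigenvalue density of the covariance matrix is known to converge almost surely to a certain function (the Marcenko-Pastur law). In this talk, I present an example in which this property is used to analyze the generalization performance of the reduced-rank regression model. In fact, the limiting eigenvalue distribution does not depend on the normality of the original random variables. (Independence, however, is essential.) For other kinds of random matrices, Wigner's semicircle law and the circular law are also known. I will also briefly touch on such properties of more general random matrices.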
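
For reference, the limiting density named in the abstract (the Marcenko-Pastur law) can be written out as follows. The notation is an assumption introduced here, not taken from the talk: X is a p x n data matrix, sigma^2 is the entry variance, and c is the limiting ratio of dimension to sample size.

```latex
% Assumption: X is a p x n matrix with i.i.d. entries of mean 0 and variance \sigma^2,
% S = \frac{1}{n} X X^\top, and p/n \to c \in (0, 1] as p, n \to \infty.
\rho(\lambda) = \frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2\pi c \sigma^2 \lambda}
\quad \text{for } \lambda_- \le \lambda \le \lambda_+,
\qquad \lambda_\pm = \sigma^2 \left(1 \pm \sqrt{c}\right)^2 .
```

The density depends on the entries only through their variance, which is consistent with the remark that normality is not required while independence is.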

Date & Time
2007/11/20 13:30-15:00
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Jan Peters (Max-Planck Institute, Germany)
Title
Towards Motor Skill Learning in Robotics
Abstract
Autonomous robots that can assist humans in situations of daily life have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. A first step towards this goal is to create robots that can learn tasks triggered by environmental context or higher-level instruction. However, learning techniques have yet to live up to this promise, as only a few methods manage to scale to high-dimensional manipulator or humanoid robots. In this talk, we investigate a general framework suitable for learning motor skills in robotics which is based on the principles behind many analytical robotics approaches. It involves generating a representation of motor skills by parameterized motor primitive policies acting as building blocks of movement generation, and a learned task execution module that transforms these movements into motor commands.

Learning parameterized motor primitives usually requires reward-related self-improvement, i.e., reinforcement learning. We propose a new, task-appropriate architecture, the Natural Actor-Critic. This algorithm is based on natural policy gradient updates for the actor while the critic estimates the natural policy gradient. Empirical evaluations illustrate the effectiveness and applicability to learning control on an anthropomorphic robot arm.

For the proper execution of motion, we need to learn how to realize the behavior prescribed by the motor primitives in their respective task space through the generation of motor commands. This transformation corresponds to solving the classical problem of operational space control through machine learning techniques. Such robot control problems can be reformulated as immediate reward reinforcement learning problems. We derive an EM-based reinforcement learning algorithm which reduces the problem of learning with immediate rewards to a reward-weighted regression problem. The resulting algorithm learns smoothly without dangerous jumps in solution space, and works well in application to complex high degree-of-freedom robots.
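
The idea of reducing immediate-reward learning to a reward-weighted regression can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm from the talk: it assumes a scalar linear policy a = theta * s, non-negative rewards used directly as regression weights, and a single update step. The function name and the data are hypothetical.

```python
def reward_weighted_gain(states, actions, rewards):
    """Weighted least-squares fit of a = theta * s, using rewards as weights."""
    numerator = sum(r * s * a for s, a, r in zip(states, actions, rewards))
    denominator = sum(r * s * s for s, r in zip(states, rewards))
    return numerator / denominator

# Hypothetical exploration data: the two rewarded samples follow a = 2 * s,
# while the unrewarded sample (reward 0) is ignored by the fit.
states = [1.0, 2.0, 3.0]
actions = [2.0, 4.0, 0.0]
rewards = [1.0, 1.0, 0.0]
theta = reward_weighted_gain(states, actions, rewards)
```

Because each update is a regression toward actions that were actually tried and rewarded, the fitted policy stays within the range of the explored actions, which is one way to read the "no dangerous jumps" remark in the abstract.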

Date & Time
2007/11/09 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Jens Kohlmorgen (Fraunhofer FIRST, Germany)
Title
Real-Time Mental Workload Detection while Driving
Abstract
The ability to immediately detect mental overload in human operators is a vital demand for complex monitoring and control processes. Such processes can be found, for example, in industrial production lines and in aviation, but also in common everyday tasks like driving. We here present an EEG-based system that is able to detect high mental workload in drivers while they are driving a car on the highway during the usual daytime traffic. The information is immediately utilized to mitigate the workload typically induced by the influx of information that is generated by the car's electronic systems. Two experimental paradigms were tested: an auditory workload scheme and a mental calculation task. While the detection performance turns out to be strongly subject-dependent, the results are good to excellent for the majority of subjects. We show that in these cases an induced mitigation of a reaction time experiment leads to an improved performance of the driver in that task. Example videos demonstrate the efficiency of this approach.

Date & Time
2007/11/02 13:30-15:00
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Nicole Krämer (Technical University of Berlin, Germany)
Title
Error Bars and Degrees of Freedom for Kernel Partial Least Squares
Abstract
Kernel Partial Least Squares (KPLS) is a supervised dimensionality reduction method that constructs orthogonal features with maximal covariance to the response variable(s). For prediction, the response is then projected onto these features. For the derivation of prediction intervals (on top of the usual point estimates), we need to determine an (approximate) distribution of the fitted function. Since the distribution cannot be determined analytically for KPLS, we propose an approximation in terms of a first-order Taylor approximation of PLS. Along the same lines, we also derive an unbiased estimate of the Degrees of Freedom of KPLS. This estimate can then be used for model selection.

Date & Time
2007/10/18 15:00-16:30
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Jean-Philippe Vert (Ecole des Mines, France)
Title
QSAR and virtual screening with support vector machines
Abstract
Support vector machines (SVM) are machine learning algorithms increasingly popular in many fields including chemoinformatics. They enjoy good performances on many real-world applications, and introduce a new framework to represent and compare the data to be processed, such as molecules: instead of an explicit representation of molecules as a set of features or a fingerprint, SVM only require the definition of a measure of similarity between molecules, called a kernel, that can in some cases be defined directly, without prior vectorization of the molecules. After a brief introduction to SVM and the notion of kernels, I will give several examples of kernels for molecules based on their 2D and 3D structures, and illustrate their relevance on toxicity prediction experiments.
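
To make the notion of a kernel defined directly on molecules concrete, here is a toy sketch. It is not one of the kernels from the talk: it assumes molecules are given as small labeled graphs (atom labels plus bond index pairs, hydrogens omitted) and simply counts pairs of bonds, one from each molecule, whose end atoms carry the same labels. All names and encodings below are hypothetical.

```python
def bond_labels(molecule):
    """Return one order-independent label pair per bond, e.g. ('C', 'O')."""
    atoms, bonds = molecule
    return [tuple(sorted((atoms[i], atoms[j]))) for i, j in bonds]

def bond_pair_kernel(mol_a, mol_b):
    """Count (bond in A, bond in B) pairs whose atom labels match."""
    labels_a = bond_labels(mol_a)
    labels_b = bond_labels(mol_b)
    return sum(1 for a in labels_a for b in labels_b if a == b)

# Toy heavy-atom graphs: (atom labels, bonds as index pairs).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propane = (["C", "C", "C"], [(0, 1), (1, 2)])
methanol = (["C", "O"], [(0, 1)])

k_ethanol_propane = bond_pair_kernel(ethanol, propane)
k_ethanol_methanol = bond_pair_kernel(ethanol, methanol)
```

The resulting similarity values could be collected into a kernel matrix and passed to any SVM solver that accepts precomputed kernels; kernels used in practice compare longer walks, subtrees, or 3D patterns rather than single bonds.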

Date & Time
2007/10/16 15:00-16:30
Venue
Seminar Room on 5th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Ryota Tomioka (University of Tokyo, Japan)
Title
Prediction over Matrices with Dual Spectral Regularization and EEG Classification
Abstract
Prediction over matrices arises naturally in many real-world problems. It is a common prior belief that the discriminative information is concentrated in some low-dimensional subspace. The dual spectral regularization expresses this inductive bias in a convex optimization framework. In fact, the L1 nature of the regularization forces many singular values to be zero. This sparseness allows good interpretation of the solution. Moreover, we propose an efficient optimization algorithm based on an interior-point method. The convex duality plays the key role in the implementation. We apply logistic regression with dual spectral regularization to a motor-imagery EEG classification problem in the context of Brain-Computer Interface (BCI). Classification results on 162 BCI datasets show significant improvement in classification accuracy over l2-regularized logistic regression, rank=2 approximated logistic regression, and a Common Spatial Pattern (CSP) based classifier, which is a popular technique in BCI. Connections to LASSO, GP classification with a second-order polynomial kernel, and SVM are discussed.

Date & Time
2007/7/25 14:00-15:30
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Liwei Wang (Peking University, China)
Title
On Learning with Dissimilarity Functions and Rademacher Margin Complexity
Abstract
Learning with dissimilarity functions is the problem of learning a classification task when only dissimilarity information about the objects is given. This problem arises partly in image recognition, where feature extraction is usually difficult but a number of image dissimilarity measures can be used. The first part of this talk is devoted to sufficient conditions for a dissimilarity function to allow one to build efficient learning algorithms. It turns out that the theory suggests a boosting-type algorithm for which the base classifier is a special kind of decision stump. I will also discuss some modifications to make the algorithm tractable. The experimental results are promising. The second part of this talk covers ongoing work called Rademacher Margin Complexity. The goal of this work is to provide more powerful error bound analysis tools, especially for dissimilarity-based learning algorithms. I will pose two open problems on the Rademacher Margin Complexity. Finally, I will discuss some possible future directions.

Date & Time
2007/7/17 15:00-16:30
Venue
Meeting Room on 10th Floor, W8E Building (Campus map, O-okayama Area, Building No.26)
Speaker
Klaus-Robert Müller (Technical University of Berlin, Germany)
Title
Machine Learning for Computational Chemistry
Abstract
This talk will first introduce standard kernel methods (SVM) and Gaussian Processes. An interesting application scenario is then discussed: in-silico modeling of chemical properties such as water solubility, toxicity, lipophilicity, etc. Accurate in-silico models for predicting aqueous solubility are needed in drug design and discovery, and many other areas of chemical research. A first-principles modeling of solubility, however, would be overly complex, since too many physical factors with separate mechanisms are involved in the phase transition from solid to solvated molecules. We present machine learning approaches that provide a statistical modeling of aqueous solubility based on measured data. The model was validated on the well-known set of 1311 compounds by Huuskonen et al., and on an in-house dataset of 632 drug candidates at Schering. On top of the excellent predictions, the proposed machine learning models also provide confidence estimates for each individual prediction.

Sugiyama Laboratory, Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology,
2-12-1-W8-74, O-okayama, Meguro-ku, Tokyo, 152-8552, Japan.
TEL & FAX: +81-3-5734-2699