Q. Why Called L* norm?
A. Origin of name is Lebesgue spaces.
$LAYYYTER
Cosimo Galluzzi

Janaina Medeiros
occasionally subtle

@theartofmadeline
NASA

#extradirty

shark vs the universe

pixel skylines

oozey mess
Lint Roller? I Barely Know Her
Xuebing Du
Sweet Seals For You, Always

⁂
Mike Driver
One Nice Bug Per Day
DEAR READER
Claire Keane
RMH
will byers stan first human second

seen from Germany

seen from Türkiye
seen from United States
seen from United States

seen from Malaysia
seen from United States

seen from Malaysia

seen from Mexico

seen from Malaysia
seen from United States
seen from Türkiye
seen from Türkiye
seen from Brazil

seen from Malaysia

seen from United Kingdom

seen from United States

seen from United States
seen from Indonesia
seen from Singapore

seen from Malaysia
@hurutoriya
Q. Why Called L* norm?
A. Origin of name is Lebesgue spaces.

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.
Free to watch • No registration required • HD streaming
Slackでファイルに対するコメント機能が添削するときに便利
添削するときに、便利。
Githubとかでプルリク出してもらってコメントするのも良いけど、スライドとかだとそれも難しいのでこの機能は良いなと思った。
Google Slideとか使えればコメント機能使えば更に便利そう。 分野的に数式を多用するので毎回beamerを使ってるのでお世話になることはなさそうなのが悲しい。
内積における線形性と共役線形性
内積関数 $< \circ ,\circ > : C \times C \to C $は線形性と共役線形性の2つの性質を持つ。
線形性(Linear)
$$< \alpha x_1 + \beta x_2,y > = \alpha < x_1 ,y> + \beta < x_2,y > $$
共役線形性(Conjugate Linear)
$$< x, \alpha y_1 + \beta y_2> = \bar{\alpha} < x,y_1 > + \bar{\beta} < x,y_2 > $$
共役線形性は反線形性( antilinear )とも呼ばれている。
個人的には反線形は誤解を招きやすいので共役線形性派です。
Matrix Analysis
posted with amazlet at 16.05.18
Cambridge University Press (2012-09-28)
Amazon.co.jpで詳細を見る
Matrix Analysisの輪講のメモ書き
How to get the Figure File from MATLAB Live Editor?
Hi there!
I had started MATLAB Liveeditor, which provid after MATLAB r2016a. Live Editor looks like a Jupyter notebook or Mathmatica notebook. This one means integrate a programming code and output.
But I face to one of Problem using MATLAB Live Editor.
Q. How to Get the Figure File in MATLAB Live Editor?
A. Double Click.
Don't need writing script.
figure; scatter(data(:,1),data(:,2),dotsize,data(:,3)); axis equal; title('outlier'); saveas(gcf,'./result/dataset','epsc')
*attention: maybe this working only after just run to Live Editor.
ICML2016 notes
Note:write about ICML2016 for me.
http://icml.cc/2016/?page_id=1649
Tutorials
Deep Reinforcement Learning David Silver (Google DeepMind)
A major goal of artificial intelligence is to create general-purpose agents that can perform effectively in a wide range of challenging tasks. To achieve this goal, it is necessary to combine reinforcement learning (RL) agents with powerful and flexible representations. The key idea of deep RL is to use neural networks to provide this representational power. In this tutorial we will present a family of algorithms in which deep neural networks are used for value functions, policies, or environment models. State-of-the-art results will be presented in a variety of domains, including Atari games, 3D navigation tasks, continuous control domains and the game of Go.
Graph Sketching, Streaming, and Space-Efficient Optimization Sudipto Guha (University of Pennsylvania) and Andrew McGregor (University of Massachusetts Amherst)
Graphs are one of the most commonly used data representation tools but existing algorithmicapproaches are typically not appropriate when the graphs of interest are dynamic, stochastic, ordo not fit into the memory of a single machine. Such graphs are often encountered as machinelearning techniques are increasingly deployed to manage graph data and large-scale graph opti-mization problems. Graph sketching is a form of dimensionality reduction for graph data that isbased on using random linear projections and exploiting connections between linear algebra andcombinatorial structure. The technique has been studied extensively over the last five years andcan be applied in many computational settings. It enables small-space online and data streamcomputation where we are permitted only a few passes (ideally only one) over an input sequence ofupdates to a large underlying graph. The technique parallelizes easily and can naturally be appliedin various distributed settings. It can also be used in the context of convex programming to enablemore efficient algorithms for combinatorial optimization problems such as correlation clustering.One of the main goals of the research on graph sketching is understanding and characterizing thetypes of graph structure and features that can be inferred from compressed representations of therelevant graphs.
Workshop
Abstraction in Reinforcement Learning Daniel Mankowitz, Shie Mannor (Technion Israel Institute of Technology), Timothy Mann (Google Deepmind)
Many real-world domains can be modeled using some form of abstraction. An abstraction is an important tool that enables an agent to focus less on the lower level details of a task and more on solving the task at hand. Temporal abstraction (i.e., options or skills) as well as spatial abstraction (i.e., state space representation) are two important examples. The goal of this workshop is to provide a forum to discuss the current challenges in designing as well as learning abstractions in real-world Reinforcement Learning (RL). http://rlabstraction2016.wix.com/icml
Neural Networks Back To The Future
Léon Bottou (Facebook), David Grangier (Facebook), Tomas Mikolov (Facebook), John Platt (Google)
As research in deep learning is extremely active today, we could take a step back and examine its foundations. We propose to have a critical look at previous work on neural networks, and try to have a better understanding of the differences with today’s work. Previous work can point at promising directions to follow, pitfalls to avoid, ideas and assumptions to revisit. Similarly, today’s progress can allow a critical examination of what should still be investigated, what has been answered… https://sites.google.com/site/nnb2tf
Accepted Papers
Graph embeddings, Clustering周りで面白そうなの集めた
Revisiting Semi-Supervised Learning with Graph Embeddings ,Zhilin Yang Carnegie Mellon University, Ruslan Salakhudinov U. of Toronto, William Cohen CMU
We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph. We develop both transductive and inductive variants of our method. In the transductive variant of our method, the class labels are determined by both the learned embeddings and input feature vectors, while in the inductive variant, the embeddings are defined as a parametric function of the feature vectors, so predictions can be made on instances not seen during training. On a large and diverse set of benchmark tasks, including text classification, distantly supervised entity extraction, and entity classification, we show improved performance over many of the existing models.
Compressive Spectral Clustering ,Nicolas TREMBLAY INRIA Rennes, Gilles Puy INRIA, Remi Gribonval INRIA, Pierre Vandergheynst EPFL
Spectral clustering has become a popular technique due to its high performance in many contexts. It comprises three main steps: create a similarity graph between N objects to cluster, compute the first k eigenvectors of its Laplacian matrix to define a feature vector for each object, and run k-means on these features to separate objects into k classes. Each of these three steps becomes computationally intensive for large N and/or k. We propose to speed up the last two steps based on recent results in the emerging field of graph signal processing: graph filtering of random signals, and random sampling of bandlimited graph signals. We prove that our method, with a gain in computation time that can reach several orders of magnitude, is in fact an approximation of spectral clustering, for which we are able to control the error. We test the performance of our method on artificial and real-world network data.
k-variates++: more pluses in the k-means++
kk-means++ seeding has become a de facto standard for hard clustering algorithms. In this paper, our first contribution is a two-way generalisation of this seeding, kk-variates++, that includes the sampling of general densities rather than just a discrete set of Dirac densities anchored at the point locations, \textit{and} a generalisation of the well known Arthur-Vassilvitskii (AV) approximation guarantee, in the form of a \textit{bias+variance} approximation bound of the \textit{global} optimum. This approximation exhibits a reduced dependency on the “noise” component with respect to the optimal potential — actually approaching the statistical lower bound. We show that kk-variates++ \textit{reduces} to efficient (biased seeding) clustering algorithms tailored to specific frameworks; these include distributed, streaming and on-line clustering, with \textit{direct} approximation results for these algorithms. Finally, we present a novel application of kk-variates++ to differential privacy. For either the specific frameworks considered here, or for the differential privacy setting, there is little to no prior results on the direct application of kk-means++ and its approximation bounds — state of the art contenders appear to be significantly more complex and / or display less favorable (approximation) properties. We stress that our algorithms can still be run in cases where there is \textit{no} closed form solution for the population minimizer. We demonstrate the applicability of our analysis via experimental evaluation on several domains and settings, displaying competitive performances vs state of the art.
CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy ,Nathan Dowlin Princeton, Ran , Kim Laine Microsoft Research, Kristin Lauter Microsoft Research, Michael Naehrig Microsoft Research, John Wernsing Microsoft Research
Applying machine learning to a problem which involves medical, financial, or other types of sensitive data, not only requires accurate predictions but also careful attention to maintaining data privacy and security. Legal and ethical requirements may prevent the use of cloud-based machine learning solutions for such tasks. In this work, we will present a method to convert learned neural networks to CryptoNets, neural networks that can be applied to encrypted data. This allows a data owner to send their data in an encrypted form to a cloud service that hosts the network. The encryption ensures that the data remains confidential since the cloud does not have access to the keys needed to decrypt it. Nevertheless, we will show that the cloud service is capable of applying the neural network to the encrypted data to make encrypted predictions, and also return them in encrypted form. These encrypted predictions can be sent back to the owner of the secret key who can decrypt them. Therefore, the cloud service does not gain any information about the raw data nor about the prediction it made. We demonstrate CryptoNets on the MNIST optical character recognition tasks. CryptoNets achieve 99% accuracy and can make more than 51000 predictions per hour on a single PC. Therefore, they allow high throughput, accurate, and private predictions.
The Variational Nystrom method for large-scale spectral problems ,Max Vladymyrov Yahoo Labs, Miguel Carreira-Perpinan UC Merced
Spectral methods for dimensionality reduction and clustering require solving an eigenproblem defined by a sparse affinity matrix. When this matrix is large, one seeks an approximate solution. The standard way to do this is the Nystrom method, which first solves a small eigenproblem considering only a subset of landmark points, and then applies an out-of-sample formula to extrapolate the solution to the entire dataset. We show that by constraining the original problem to satisfy the Nystrom formula, we obtain an approximation that is computationally simple and efficient, but achieves a lower approximation error using fewer landmarks and less runtime. We also study the role of normalization in the computational cost and quality of the resulting solution.
Fast Stochastic Algorithms for SVD and PCA: Convergence Properties and Convexity ,Ohad Shamir Weizmann Institute of Science
We study the convergence properties of the VR-PCA algorithm introduced by (Shamir, 2015) for fast computation of leading singular vectors. We prove several new results, including a formal analysis of a block version of the algorithm, and convergence from random initialization. We also make a few observations of independent interest, such as how pre-initializing with just a single exact power iteration can significantly improve the runtime of stochastic methods, and what are the convexity and non-convexity properties of the underlying optimization problem.
Convergence of Stochastic Gradient Descent for PCA ,Ohad Shamir Weizmann Institute of Science
We consider the problem of principal component analysis (PCA) in a streaming stochastic setting, where our goal is to find a direction of approximate maximal variance, based on a stream of i.i.d. data points in R^d. A simple and computationally cheap algorithm for this is stochastic gradient descent (SGD), which incrementally updates its estimate based on each new data point. However, due to the non-convex nature of the problem, analyzing its performance has been a challenge. In particular, existing guarantees rely on a non-trivial eigengap assumption on the covariance matrix, which is intuitively unnecessary. In this paper, we provide (to the best of our knowledge) the first eigengap-free convergence guarantees for SGD in the context of PCA. This also partially resolves an open problem posed in \cite{hardt2014noisy}. Moreover, under an eigengap assumption, we show that the same techniques lead to new SGD convergence guarantees with better dependence on the eigengap.
Low-Rank Matrix Approximation with Stability ,Dongsheng Li IBM Research – China, Chao Chen , Qin Lv , Junchi Yan , Li Shang , Stephen Chu
Low-rank matrix approximation has been widely adopted in machine learning applications with sparse data, such as recommender systems. However, the sparsity of the data, incomplete and noisy, introduces challenges to the algorithm stability — small changes in the training data may significantly change the models. As a result, existing low-rank matrix approximation solutions yield low generalization performance, exhibiting high error variance on the training dataset, and minimizing the training error may not guarantee error reduction on the testing dataset. In this paper, we investigate the algorithm stability problem of low-rank matrix approximations. We present a new algorithm design framework, which (1) introduces new optimization objectives to guide stable matrix approximation algorithm design, and (2) solves the optimization problem to obtain stable low-rank approximation solutions with good generalization performance. Experimental results on real-world datasets demonstrate that the proposed work can achieve better prediction accuracy compared with both state-of-the-art low-rank matrix approximation methods and ensemble methods in recommendation task.
Fast Methods for Estimating the Numerical Rank of Large Matrices ,Shashanka Ubaru University of Minnesota, Yousef Saad University of Minnesota
We present two computationally inexpensive techniques for estimating the numerical rank of a matrix, combining powerful tools from computational linear algebra. These techniques exploit three key ingredients. The first is to approximate the projector on the non-null invariant subspace of the matrix by using a polynomial filter. Two types of filters are discussed, one based on Hermite interpolation and the other based on Chebyshev expansions. The second ingredient employs stochastic trace estimators to compute the rank of this wanted eigen-projector, which yields the desired rank of the matrix. In order to obtain a good filter, it is necessary to detect a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that correspond to the non-null invariant subspace. The third ingredient of the proposed approaches exploits the idea of spectral density, popular in physics, and the Lanczos spectroscopic method to locate this gap.
Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing ,Ke Li UC Berkeley, Jitendra Malik
Existing methods for retrieving k-nearest neighbours suffer from the curse of dimensionality. We argue this is caused in part by inherent deficiencies of space partitioning, which is the underlying strategy used by almost all existing methods. We devise a new strategy that avoids partitioning the vector space and present a novel randomized algorithm that runs in time linear in dimensionality and sub-linear in the size of the dataset and takes space constant in dimensionality and linear in the size of the dataset. The proposed algorithm allows fine-grained control over accuracy and speed on a per-query basis, automatically adapts to variations in dataset density, supports dynamic updates to the dataset and is easy-to-implement. We show appealing theoretical properties and demonstrate empirically that the proposed algorithm outperforms locality-sensitivity hashing (LSH) in terms of approximation quality, speed and space efficiency.
Community Recovery in Graphs with Locality ,Yuxin Chen Stanford University, Govinda Kamath Stanford University, Changho Suh KAIST, David Tse Stanford University
Motivated by applications in domains such as social networks and computational biology, we study the problem of community recovery in graphs with locality. In this problem, pairwise noisy measurements of whether two nodes are in the same community or different communities come mainly or exclusively from nearby nodes rather than uniformly sampled between all nodes pairs, as in most existing models. We present an algorithm that runs linearly in the number of measurements and which achieves the information theoretic limit for exact recovery.
Fast k-means with accurate bounds ,James Newling Idiap Research Institute, Francois Fleuret Idiap research institute
Our first contribution is an accelerated exact k-means algorithm, which performs better than the current state-of-the-art low-dimensional algorithm in 18 of 22 experiments, running up to 3x faster. Our second contribution is a general improvement of existing state-of-the-art accelerated exact k-means algorithms, which relies on better estimates of distance bounds used to reduce the required number of distance calculations, and leads to a speedup in 36 of 44 experiments, running up to 1.8x faster. We have conducted experiments with our own implementations of existing state-of-the-art methods to ensure homogeneous evaluation of performance, and we show that these new implementations perform as well or better than existing available implementations. We also propose simplified variants of standard approaches and show that they are faster than their fully-fledged counterparts in 59 of 62 experiments.
KK-Means Clustering with Distributed Dimensions ,Hu Ding State University of New York at Buffalo, Lingxiao Huang , Yu Liu Tsinghua University, Jian Li
Distributed clustering has attracted significant attention in recent years. In this paper, we study the kk-means problem in the distributed dimension setting, where the dimensions of the data are partitioned across multiple machines. We provide new approximation algorithms, which incur low communication costs and achieve constant approximation ratios. The communication complexity of our algorithms significantly improve on existing algorithms. We also provide the first communication lower bound, which nearly matches our upper bound in a certain range of parameter setting. Our experimental results show that our algorithms outperform existing algorithms on real data-sets in the distributed dimension setting.
Non-negative Matrix Factorization under Heavy Noise ,Jagdeep Pani Indian Institute of Science, Ravindran Kannan Microsoft Reseach India, Chiranjib Bhattacharya , Navin Goyal Microsoft Research India
The Noisy Non-negative Matrix factorization (NMF) is: given a data matrix A (d x n), find non-negative matrices B;C (d x k, k x n respy.) so that A = BC +N, where N is a noise matrix. Existing polynomial time algorithms with proven error guarantees require EACH column N_.j to have l1 norm much smaller than ||(BC).j ||_1, which could be very restrictive. In important applications of NMF such as Topic Modeling as well as theoretical noise models (e.g. Gaussian with high sigma), almost EVERY column of N.j violates this condition. We introduce the heavy noise model which only requires the average noise over large subsets of columns to be small. We initiate a study of Noisy NMF under the heavy noise model. We show that our noise model subsumes noise models of theoretical and practical interest (for e.g. Gaussian noise of maximum possible sigma). We then devise an algorithm TSVDNMF which under certain assumptions on B,C, solves the problem under heavy noise. Our error guarantees match those of previous algorithms. Our running time of O(k.(d+n)^2) is substantially better than the O(d.n^3) for the previous best. Our assumption on B is weaker than the “Separability” assumption made by all previous results. We provide empirical justification for our assumptions on C. We also provide the first proof of identifiability (uniqueness of B) for noisy NMF which is not based on separability and does not use hard to check geometric conditions. Our algorithm outperforms earlier polynomial time algorithms both in time and error, particularly in the presence of high noise.
Autoencoding beyond pixels using a learned similarity metric ,Anders B. L. Larsen Technical University of Denmar, Søren Kaae Sønderby University of Copenagen, Hugo Larochelle Twitter, Ole Winther Technical University of Denmark
We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder with a generative adversarial network we can use learned feature representations in the GAN discriminator as basis for the VAE reconstruction objective. Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance towards e.g. translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
Matrix Eigendecomposition via Doubly Stochastic Riemannian Optimization ,Zhiqiang Xu Institute of Infocomm Research, Peilin Zhao I2R
Matrix eigendecomposition plays a significant role in machine learning, which is a non-convex quadratic optimization problem. In this paper, we propose a doubly stochastic Riemannian gradient DSRG method to address this problem. Specifically, the double stochasticity comes from the Riemannian counterparts of stochastic Euclidean gradient ascent and stochastic Euclidean coordinate ascent. With the double stochasticity, DSRG can greatly reduce the complexity, and completely avoid the matrix inversion, so that it is very suitable to large matrices eigendecomposition. We theoretically analyze its convergence properties and empirically validate it on real-world datasets. Encouraging experimental results demonstrate its advantages over the non-stochastic counterpart.
Persistence weighted Gaussian kernel for topological data analysis ,Genki Kusano Tohoku University, Yasuaki Hiraoka , Kenji
Topological data analysis (TDA) is an emerging mathematical concept for characterizing shapes in complex data. In TDA, persistence diagrams are widely recognized as a useful descriptor of data, and can distinguish robust and noisy topological properties. This paper proposes a kernel method on persistence diagrams to develop a statistical framework in TDA. The proposed kernel satisfies the stability property and provides explicit control on the effect of persistence. Furthermore, the method allows a fast approximation technique. The method is applied into practical data on proteins and oxide glasses, and the results show the advantage of our method compared to other relevant methods on persistence diagrams.
Clustering is a powerful tool in data analysis, but it is often difficult to find a grouping that aligns with a user’s needs. To address this, several methods incorporate constraints obtained from users into clustering algorithms, but unfortunately do not apply to hierarchical clustering. We design an interactive Bayesian algorithm that incorporates user interaction into hierarchical clustering while still utilizing the geometry of the data by sampling a constrained posterior distribution over hierarchies. We also suggest several ways to intelligently query a user. The algorithm, along with the querying schemes, shows promising results on real data.
Interactive Bayesian Hierarchical Clustering ,Sharad Vikram UCSD, Sanjoy Dasgupta UCSD
Clustering is a powerful tool in data analysis, but it is often difficult to find a grouping that aligns with a user’s needs. To address this, several methods incorporate constraints obtained from users into clustering algorithms, but unfortunately do not apply to hierarchical clustering. We design an interactive Bayesian algorithm that incorporates user interaction into hierarchical clustering while still utilizing the geometry of the data by sampling a constrained posterior distribution over hierarchies. We also suggest several ways to intelligently query a user. The algorithm, along with the querying schemes, shows promising results on real data.
Cross-graph Learning of Multi-relational Associations ,Hanxiao Liu , Yiming Yan
Cross-graph Relational Learning (CGRL) refers to the problem of predicting the strengths or labels of multi-relational tuples of heterogeneous object types, through the joint inference over multiple graphs which specify the internal connections among each type of objects. CGRL is an open challenge in machine learning due to the daunting number of all possible tuples to deal with when the numbers of nodes in multiple graphs are large, and because the labeled training instances are extremely sparse as typical. Existing methods such as tensor factorization or tensor-kernel machines do not work well because of the lack of convex formulation for the optimization of CGRL models, the poor scalability of the algorithms in handling combinatorial numbers of tuples, and/or the non-transductive nature of the learning methods which limits their ability to leverage unlabeled data in training. This paper proposes a novel framework which formulates CGRL as a convex optimization problem, enables transductive learning using both labeled and unlabeled tuples, and offers a scalable algorithm that guarantees the optimal solution and enjoys a constant time complexity with respect to the sizes of input graphs. In our experiments with a subset of DBLP publication records and an Enzyme multi-source dataset, the proposed method successfully scaled to the large cross-graph inference problem, and outperformed other representative approaches significantly.
Speeding up k-means by approximating Euclidean distances via block vectors ,Thomas Bottesch Ulm University, Thomas Bühler Avira, Markus Kächele Ulm University
This paper introduces a new method to approximate Euclidean distances between points using block vectors in combination with the Holder inequality. By defining lower bounds based on the proposed approximation, cluster algorithms can be considerably accelerated without loss of quality. In extensive experiments, we show a considerable reduction in terms of computational time in comparison to standard methods and the recently proposed Yinyang k-means. Additionally we show that the memory consumption of the presented clustering algorithm does not depend on the number of clusters, which makes the approach suitable for large scale problems.
Faster Eigenvector Computation via Shift-and-Invert Preconditioning ,Dan Garber TTI Chicago, Elad Hazan Princeton University, Chi Jin UC Berkeley, Sham , Cameron Musco Massachusetts Institute of Technology, Praneeth Netrapalli Microsoft Research, Aaron Sidford Microsoft Research
We give faster algorithms and improved sample complexities for the fundamental problem of estimating the top eigenvector of a given matrix. In particular, given an explicit matrix A∈\Rn×dA∈\Rn×d such that Σ=A⊤AΣ=A⊤A, we show how to compute an ϵϵ approximate top eigenvector in time \otilde([\nnz(A)+d\sr(A)\gap2]⋅log1/ϵ)\otilde([\nnz(A)+d\sr(A)\gap2]⋅log1/ϵ). Here \nnz(A)\nnz(A) is the number of nonzeros in AA, \sr(A)\sr(A) is the stable rank, and \gap\gap is the relative eigengap. We also consider an online setting in which, given a stream of i.i.d. samples from a distribution DD with covariance matrix ΣΣ and a vector x0x0 which is an O(\gap)O(\gap) approximate top eigenvector for ΣΣ, we show how to refine x0x0 to an ϵϵ approximation using O(\var(D)\gap⋅ϵ)O(\var(D)\gap⋅ϵ) samples from DD. Here \var(D)\var(D) is a natural notion of variance. Combining our algorithm with previous work to initialize x0x0, we obtain improved sample complexity and runtime results under a variety of assumptions on \D\D. Notably, we show that, for general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large. We achieve our results by combining a new robust analysis of the classic method of \emph{shift-and-invert} preconditioning, which reduces eigenvector computations to \emph{approximately} solving a sequence of linear systems, with fast stochastic gradient based system solvers to achieve our claims.
Efficient Algorithms for Large-scale Generalized Eigenvector Computation and CCA ,Rong Ge , Chi Jin UC Berkeley, Sham , Praneeth Netrapalli Microsoft Research, Aaron Sidford Microsoft Research
This paper considers the problem of canonical-correlation analysis (CCA) and, more broadly, the generalized eigenvector problem for a pair of symmetric matrices. These are two fundamental problems in data analysis and scientific computing with numerous applications in machine learning and statistics. We provide simple iterative algorithms, with improved runtimes, for solving these problems that are globally linearly convergent with moderate dependencies on the condition numbers and eigenvalue gaps of the matrices involved. We obtain our results by reducing CCA to the top-kk generalized eigenvector problem. We solve this problem through a general framework that simply requires black box access to an approximate linear system solver. Instantiating this framework with accelerated gradient descent we obtain a running time of \orderzkκ√ρlog(1/ϵ)log(kκ/ρ)\orderzkκρlog(1/ϵ)log(kκ/ρ) where zz is the total number of nonzero entries, κκ is the condition number and ρρ is the relative eigenvalue gap of the appropriate matrices. Our algorithm is linear in the input size and the number of components kk. This is essential for handling large-scale matrices that appear in practice. To the best of our knowledge this is the first such algorithm with global linear convergence. We hope that our results prompt further research and ultimately improve the practical running time for performing these important data analysis procedures on large data sets.

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.
Free to watch • No registration required • HD streaming
ICDE2016 notes
Note:write about SDM2015 for me.
http://icde2016.fi/papers.php#tabular1
Streaming Spectral Clustering Shinjae Yoo*, Hao Huang and Shiva Kasiviswanathan.
pSCAN: Fast and Exact Structural Graph Clustering Lijun Chang*, Wei Li, Xuemin Lin, Lu Qin and Wenjie Zhang.
BlackHole: Robust Community Detection Inspired by Graph Drawing Sungsu Lim, Junghoon Kim and Jae-Gil Lee*.
SDM2015 notes
Note:write about SDM2015 for me.
http://www.siam.org/meetings/sdm15/
Oral data
Poster Data
I had found acceptance rate on SDM12. acceptance rate of SDM12 is 15%.
T. Zhou, H. Shan, A. Banerjee, G. Sapiro. Kernelized Probabilistic Matrix Factorization. SIAM International Conference on Data Mining (SDM), Anaheim, U.S., 2012. (Acceptance rate: 15%) [pdf] [code]
http://www-users.cs.umn.edu/~shan/publication.html
2016/05/11 TurtrialsとWorkshopsを追加
SDM15: Call for Tutorials
Information Theory in Data Mining
Presenters: Matthijs van Leeuwen, Katholieke Universiteit Leuven, Belgium; Arno Siebes, Universiteit Utrecht, Netherlands; Jilles Vreeken, Max-Planck Institute for Informatics, Germany and Saarland University, Germany Tutorial Website: http://usefulpatterns.org/itdm/
Call For Workshops
Mining Networks and Graphs: A Big Data Analytic Challenge Organizers: Larry Holder, Maleq Khan, Christine Klymko
http://staff.vbi.vt.edu/maleq/MNG2015/
Oral
Clustering and Ranking in Heterogeneous Information Networks via Gamma‐Poisson Model Junxiang Chen*, Northeastern University; Wei Dai, Northeastern University; Yizhou Sun, Northeastern University; Jennifer Dy, Northeastern University
Community Detection for Emerging NetworksJiawei Zhang*, UIC; Philip S. Yu, University of Illinois at Chicago
Efficient Algorithms for a Robust Modularity‐Driven Clustering of Attributed Graphs Patricia Iglesias Sanchez*, KIT; Emmanuel Müller, Karlsruhe Institute of Technology (KIT); Uwe Leo Korn, KIT; Klemens Boehm, KIT; Andrea Kappes, KIT; Tanja Hartmann, ; Dorothea Wagner, KIT
Factorized bilinear similarity for cold‐start item recommendations Mohit Sharma*, University of minnesota; George Karypis, U. Minnesota; Jiayu Zhou, Samsung Research; Junling Hu, Samsung Research America
Fast Eigen‐Functions Tracking on Dynamic Graphs Chen Chen*, Arizona State University; HanghangTong , Arizona State University
Low Rank Representation on Riemannian Manifold of Symmetric Positive Definite Matrices Yifan Fu*, CSU; Junbin Gao, CSU; Xia Hong, University of Reading; David Tien, CSU
Spectral Embedding of Signed Networks David Skillicorn*, Queen's University; Quan Zheng, Queen's University
Tensor Spectral Clustering for Partitioning Higher‐order Network Structures Austin Benson*, Stanford University; David Gleich, Purdue University; Jure Leskovec, Stanford
Vertex Clustering of Augmented Graph Streams Ryan McConville*, Queen's University Belfast; Weiru Liu, Queen's University Belfast; Paul Miller, Queen's University Belfast
Where Graph Topology Matters: The Robust Subgraph Problem Hau Chan, Stony Brook University; Shuchu Han, Stony Brook University; Leman Akoglu*, SUNY Stony Brook
Poster
I had never find interesting poster; and number of poster session is small.
I want to watch a stastical data on SDM2015. Where is acceptance rate?
CVPR2016 notes
Note: write about CVPR2016 for me.
http://cvpr2016.thecvf.com/program/
Accept Rates
This year, we received a record 2145 valid submissions to the main conference, of which 1865 were fully reviewed (the others were either administratively rejected for technical or ethical reasons or withdrawn before review). A total of 643 papers were accepted (29.9% of valid submissions). 83 of these were allocated long oral presentations (3.9% of valid submissions) and 123 were allocated short spotlight presentations (total of 9.7% of valid submissions have live presentations). All papers will appear also in the interactive session where posters are presented.
Turtrial
6/26/16 (full day) Low-Rank and Sparse Modeling for Visual Analytics Rene Vidal, Ehsan Elhamifar, Zhouchen Lin, Jiashi Feng, Sheng Li, Yun Fu
6/26/16 (PM only) Computer Vision and Applied Deep Learning with Mathematica Sebastian Bodenstein, Matthias Odisio
6/26/16 (PM only) Mathematics of Deep Learning Rene vidal, Guillermo Sapiro, Joan Bruna, Benjamin Haeffele, Raja Giryes
6/26/16 (PM only) Recent Image Search Techniques Sungeui Yoon and Zhe Lin
Oral
2 Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
3 You Only Look Once: Unified, Real-Time Object Detection. Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
13 Unsupervised Learning of Edges. Yin Li, Manohar Paluri, James M. Rehg, Piotr Dollár
15 Deeply-Recursive Convolutional Network for Image Super-Resolution. Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee
16 Accurate Image Super-Resolution Using Very Deep Convolutional Networks. Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee
15 Self-Adaptive Matrix Completion for Heart Rate Estimation From Face Videos Under Realistic Conditions. Sergey Tulyakov, Xavier Alameda-Pineda, Elisa Ricci, Lijun Yin, Jeffrey F. Cohn, Nicu Sebe
4 Personalizing Human Video Pose Estimation. James Charles, Tomas Pfister, Derek Magee, David Hogg, Andrew Zisserman
14 Scalable Sparse Subspace Clustering by Orthogonal Matching Pursuit. Chong You, Daniel Robinson, René Vidal
Sporlight Session
19 Affinity CNN: Learning Pixel-Centric Pairwise Relations for Figure/Ground Embedding. Michael Maire, Takuya Narihira, Stella X. Yu
21 Groupwise Tracking of Crowded Similar-Appearance Targets From Low-Continuity Image Sequences. Hongkai Yu, Youjie Zhou, Jeff Simmons, Craig P. Przybyla, Yuewei Lin, Xiaochuan Fan, Yang Mi, Song Wang
23 What Players Do With the Ball: A Physically Constrained Interaction Modeling. Andrii Maksai, Xinchao Wang, Pascal Fua
6 We Are Humor Beings: Understanding and Predicting Visual Humor. Arjun Chandrasekaran, Ashwin K. Vijayakumar, Stanislaw Antol, Mohit Bansal, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh
7 Where to Look: Focus Regions for Visual Question Answering. Kevin J. Shih, Saurabh Singh, Derek Hoiem
6 Simultaneous Clustering and Model Selection for Tensor Affinities. Zhuwen Li, Shuoguang Yang, Loong-Fah Cheong, Kim-Chuan Toh
7 Discriminatively Embedded K-Means for Multi-View Clustering. Jinglin Xu, Junwei Han, Feiping Nie
Poster
40 Pairwise Matching Through Max-Weight Bipartite Belief Propagation. Zhen Zhang, Qinfeng Shi, Julian McAuley, Wei Wei, Yanning Zhang, Anton van den Hengel
26 Video2GIF: Automatic Generation of Animated GIFs From Video. Michael Gygli, Yale Song, Liangliang Cao
37 SketchNet: Sketch Classification With Web Images. Hua Zhang, Si Liu, Changqing Zhang, Wenqi Ren, Rui Wang, Xiaochun Cao
49 Similarity Learning With Spatial Constraints for Person Re-Identification. Dapeng Chen, Zejian Yuan, Badong Chen, Nanning Zheng
62 Efficient Large-Scale Similarity Search Using Matrix Factorization. Ahmet Iscen, Michael Rabbat, Teddy Furon
73 Eye Tracking for Everyone. Kyle Krafka, Aditya Khosla, Petr Kellnhofer, Suchendra Bhandarkar, Wojciech Matusik, Antonio Torralba
78 iLab-20M: A Large-Scale Controlled Object Dataset to Investigate Deep Learning. Ali Borji, Saeed Izadi, Laurent Itti
68 Iterative Instance Segmentation. Ke Li, Bharath Hariharan, Jitendra Malik
MATLAB R2016aから提供されるLive Editor機能が革命的 ~完全にMATLAB版 Jupyter Notebook~
先日MATLABR2016aがリリースされて新機能を見てみると中々魅力的だったので、早速インストールしてみた。
そして見返したらImplementationのスペルを間違えているというね。。。。
Live Editor
一言でまとめると、まんまJupyter NotebookもしくはMathmatica notebookですね。
MATLAB Live Editor
以下の実行方法を示すライブ スクリプトをダウンロードできます。 Live Editor を使用してライブ スクリプトを編集、実行および保存する 出力とその出力を生成したコードを一緒に表示する セクションを個別に作成して実行する 書式付きテキスト、方程式、イメージ、ハイパーリンクを追加して共有可能な対話型の表現を作成する
他のMATLAB R2016aの新機能はこちら
実行方法
edit ***.mlx
上記のコマンドでLive Editorを使用できるスクリプトが作られます。 今までもLatexのコードやHTMLを埋め込んだりできたのですが、Live Editorの機能によって更に便利になりました。
利点
出力とスクリプトが対応付けられるので帰納的にプログラムを組むときにとても便利。
自分の用途としてもMATLABはサクッと確認したい時に使うことが多いので今回は本当に良いアップデートだなと思う。
出力と入力が対応していて、文字のカーソルや出力をクリックするとコード、出力間をジャンプする
*.mlxファイルはPDF,HTMLファイルへ出力可能なので、実験レポートがめちゃくちゃ楽になる。(数式もかける)
惜しいところ
ショートカットコマンドが機能しない
[2016/05/20 追記] ショートカットキー ⌘+↩ (Command + Enter) でLiveEditorの実行が可能。
テキスト埋め込みで日本語が対応してない
PRML勉強会第11回
パターン認識と機械学習の勉強会 第11回を開催してきました。
PRML勉強会 #11
論文紹介
Dhillon, Inderjit S., Yuqiang Guan, and Brian Kulis. "Kernel k-means: spectral clustering and normalized cuts." Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004.
発表者 : @huturoriya
Kernel k-meansとSpectral Clusteringの理論的関連性を証明。 詳細は以下の記事を御覧ください。
“Kernel k-means: spectral clustering and normalized cuts(KDD04)” 読んだ
Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization
発表者 : @HiroyoshiIto
異なるデータセットの、同一にトピック、相異なるトピックを検知する手法を提案。ワードクラウドが綺麗。
Show Notes
以下話題になった記事です。
Add Reactions to Pull Requests, Issues, and Comments
Rebuild.fm で知ったネタ。issueでの会話の返しが簡単になった
カーネルとは直感的に説明するとなんなのか?
Simultaneous Discovery of Common and Discriminative Topics via Joint Nonnegative Matrix Factorization
著者自身の紹介論文の発表スライド。トピックによるワードクラウドの可視化がわかりやすい。
mkdocs
勉強会のサイトをmkdocsで作りなおした。マークダウンだけで管理できるのでとても簡単。

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.
Free to watch • No registration required • HD streaming
筑波大学に入学した人がまずするべき3つのこと
筑波大学学年暦 | Google Calendar
Google Calendarに筑波大学の学年暦が表示されるようになる
Twincal
Twinsの履修情報から、Googleカレンダー・iCalへ時間割を表示するWebサービス
Google Calendarから時間割を確認できます。便利な時代になった。
えりたんBOTのアプリ
iPhone、Androidともにアプリがあるのでインストールしましょう。筑波大学の地図、バスと電車の情報が把握できます。
地図機能が特に秀逸でGoogle Mapの上に筑波大学の地図がかぶせて表示されているので、新入生が「え?中央会館ってどこ? 3B棟って?」と広大な筑波大学でも迷わずに生活することができます。
Macで高額アプリを間違って購入してしまったけど、Appleが払い戻しをしてくれたお話
起きがけにメールの確認をしていると...
見に覚えのないAppStoreの中でも最高額に近いアプリの購入が。。。
動画編集するつもりも無いので
焦って対処法を探してみると下記の記事が見つかり焦燥感が募る...
Mac App Storeで間違って購入 返品可能? 下記リンク先に「ご購入の取消は致しかねます。 本サービスを通じて提供される商品の価格はいつでも変更することがあり、値下げもしくは販売促進価格の場合でも価格保護または返金を行いません。」 との事です。 http://www.apple.com/legal/itunes/jp/terms.html
日本語の記事だと新しい情報が無いので、英語で探してみると対処法がありました。 How to Return Apps Purchased from the Mac App Store
AppleでAppを購入すると必ずApple IDのメールへ購入メールが送信されると思いますが、送られてきたメールの問題を報告するというリンクを押すと返金処理を申請することができます。
対象とするAppの問題を報告するボタンをクリックし、該当する問題を選択して送信します。 (自分の画面だと既に返金処理がされた後です)
その後Apple IDで登録しているメール宛に返金処理の対応がされます。 無事返金されました。
クレジットカードを使用しているので、払い戻しが来月になるそうですが、無事払い戻しが許諾されました。
まとめ
とても焦りましたが、MacAppStoreから無事払い戻し成功しました。
Link
How to Return Apps Purchased from the Mac App Store
Mac App Storeで間違って購入 返品可能?
Github Desktopに絵文字サジェスト機能が追加されてた
最近Github Desktopに絵文字サジェスト機能が実装された事に気がついた。
コミットメッセージに絵文字を使うようにしているのでサジェスト機能がとてもありがたい。
Githubでコミットメッセージと説明に絵文字使うと華やかになってよい
蛇足: Githubにコメントに絵文字リアクション機能が実装されたのも良いですね。
Add Reactions to Pull Requests, Issues, and Comments
Github Blogから引用
Link
GitHubで絵文字コミットを続けてみて有用だったEmojiまとめ
GitHub で絵文字入りコミットメッセージを活用しているプロジェクトを調べてみた
p1ch-jp/emoji-commit-message-guideline.md
"Kernel k-means: spectral clustering and normalized cuts(KDD04)” 読んだ
http://dl.acm.org/citation.cfm?id=1014118
Author
Inderjit S. Dhillon
https://www.cs.utexas.edu/~inderjit/
Gottesman Family Centennial Professor Director, Center for Big Data Analytics Department of Computer Science University of Texas at Austin
この論文の発展させた物がある。
A Unified View of Kernel k-means, Spectral Clustering and Graph Cuts
Kernel とグラフ分割の研究が面白そう
Yuqiang Guan
https://scholar.google.com/citations?user=HpWc7m4AAAAJ&hl=en
webサイトが無かった..
Note:Weighted graph cuts without eigenvectors a multilevel approach 固有ベクトルを使わないアプローチのグラフカット
Brian Kulis
Peter J. Levine Career Development Assistant Professor
Note: Power-Law Graph Cuts(2014) 読んでみよう
Motivation
Reseatch Question
Kernel k-meansとSpectral Clusteringの理論的繋がりによる関係性を明確にする。
Kernel k-means objective function and Spectral Clusteirng ofject of normalized cut.
a)固有値計算の計算コストはボッタクリでn-cutの最小化に関して本質的ではない
b)様々な手法が kernel k-meansの高速化、改善を行うことができる
Related work
Proposed Method
k-means:非線形分離が出来ない
kernel k-means ; 非線形関数を用いて高次元特徴空間へ写像→その空間上にてk-meansを行う
spectral clusteirng:類似度行列の固有ベクトルからクラスタリングを行う(normalized cutの最小化を近似する)
kernel k-meansの優位点:任意のkernelと重みに対して一般化が可能
特定の重みを計算するとkernel k-meansとspectral clusteringの目的関数は同一になる。
大規模行列に対する多数の固有ベクトルを求める固有値計算(Lanczos mehod)は計算量が無視で居ないほどが大きい
kernek k-means: k-meansをベースにした反復法によるn-cutの最小化を提案
Kernel:$M \times M \to R$
for example) $\kappa(x_i,x_j)=\Phi(x_i)\cdot \Phi(x_j)$
Polynomial Kernel :$\kappa(a,b)=(a\cdot b+c)^d$
Gaussian Kernel : $\kappa(a,b)=\exp(- |a-b|^2 / 2 \sigma^2)$
Sigmoid Kernel : $\kappa(a,b)=\tanh(c(a \cdot b)+ \theta)$
計算したkernelはカーネル行列$K$へ。これにより全ての内積をカーネル行列の要素によって置き換えることが出来た。
Input: K: Kernel matrix, k: number of clusters, w: weight for each point.
Output: $C_1, \ldots, C_k$: Partitioning of the points
Initialize the k clusters: $C_1^{(0)}, \ldots, C_k^{(0)}$
Set $t=0$
For each point $a$, find its new cluster index as $j^*(a)= argmin_j | \phi(a)-m_j|$
Compute the updated clusters as $C^{t+1}_j= { a : j^{*}(a)=j}$
If not converged, se $t = t+1$ and go to Step3; Otherwise stop.
類似度行列$W$のtraceの最大化問題からSpectral ClusternigとKernel k-meansがともにn-cutを緩和近似していることを証明。
Evaluation
Experiment
Data Set
遺伝子データ、手書き文字
Contribution
Kernel k-meansでn-cutを近似した方が目的関数の収束が速い
Contribution
Kernel k-meansとSpectral clusteringの関連性を示した
Comment
とても読みやすい論文だった。
$A$が正定値であることを仮定して、2つのアルゴリズムの同一性(N-cutの最小化)を提案しているが、$A$が一般的に正定値行列であることはない。そのため無理のある主張になっているのでは?
[2016/03/30 追記] 論文中でAffinity Matrixと言っているのはどうやら$K =D^{-1}AD^{-1}$を指しているようなので誤解でした。
[2016/03/31 追記] $D^{-1}AD^{-1}$は対角成分は零にしかなりえません。なのでやはり正定値という仮定がおかしい。
Kernel Trick : Kernlを用いて既存のアルゴリズムを拡張→何が嬉しい? Kernelを用いることで非線形な表現が可能に。そこが良い所?
k-meansでクラスタリングしたら失敗したけどどうしよう(kernel k-meansの実装)
The Global Kernel -Means Algorithm for Clustering in Feature Space
k-means Vs Kernel k-means
機械学習、数値計算、情報収集で役立つメーリングリスト
IBISML(情報論的学習理論と機械学習)
日本で開催されている学会IBISMLのメーリングリスト。機械学習に関する情報が流れてくる。
Google Groupで管理されているでの色々と便利。
https://groups.google.com/forum/?hl=ja&fromgroups#!forum/ibisml
人工知能学会
日本人工知能学会が運営するメーリングリスト。
人工知能や日本で開催される様々な学会イベントが雑多に流れてくる。
http://www.ai-gakkai.or.jp/mailing-list/
NA digest
数値解析・数値計算に関するメーリングリスト。先週先生に教えてもらって知りました。
http://www.netlib.org/na-digest-html/
研究関連でもしオススメのメーリングリストがあれば教えて下さい。

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.
Free to watch • No registration required • HD streaming
Self-turing spectral clustering[NIPS2004]を実装した
Self-turing spectral clusteringで提案されているlocal scaleを実装をした(めっちゃ簡単)。 エッジに類似度を重みに持つグラフを構成するGaussian Kernelの分散値をlocalscaleと実証的に良い分散での比較を行いました。
論文自体のメモはこちら - “Self-Tuning Spectral Clustering (NIPS2004)”読んだ
localscaleの良い所は、ヒューリスティックに分散の値を決めなくて良い所と実装がめちゃくちゃ簡単な点です。
以下に簡単な実装例をGistで載せます
PRML勉強会 第10回 論文読みを開催&発表をしてきました。
PRML勉強会 #10 論文読み会
PRML勉強会 #10 論文読み まとめ| Togger