full_info,tags
"real-time visual place recognition for personal localization on a mobile device. this paper presents an approach to indoor personal localization on a mobile device based on visual place recognition. we implemented on a smartphone two state-of-the-art algorithms that are representative of two different approaches to visual place recognition: fab-map, which recognizes places using individual images, and able-m, which utilizes sequences of images. these algorithms were evaluated in environments of different structure, focusing on problems commonly encountered when a mobile device camera is used. the conclusions drawn from this evaluation provided guidelines for the design of the fastable system, which, based on the able-m algorithm, introduces major modifications to the concept of image matching. these improvements radically cut the processing time and improve scalability, making it possible to localize the user along long image sequences with the limited computing power of a mobile device. the resulting place recognition system compares favorably to both the able-m and fab-map solutions in the context of real-time personal localization.",3
"local optimality and generalization guarantees for the langevin algorithm via empirical metastability. we study the detailed path-wise behavior of the discrete-time langevin algorithm for non-convex empirical risk minimization (erm) through the lens of metastability, adopting techniques from berglund and gentz. for a particular local optimum of the empirical risk, with an arbitrary initialization, we show that, with high probability, one of two mutually exclusive events will occur: either the langevin trajectory ends up somewhere outside the $\varepsilon$-neighborhood of this particular optimum within a short recurrence time, or it enters this $\varepsilon$-neighborhood by the recurrence time and stays there until an exponentially long escape time. we call this phenomenon empirical metastability. this two-timescale characterization aligns nicely with the existing literature in the following two senses. first, the recurrence time is dimension-independent and resembles the convergence time of deterministic gradient descent (gd). however, unlike gd, the langevin algorithm does not require strong conditions on local initialization and has the possibility of eventually visiting all the optima.
second, the scaling of the escape time is consistent with the eyring-kramers law, which states that the langevin scheme will eventually visit all local minima, but will take an exponentially long time to transit among them. we apply this path-wise concentration result in the context of statistical learning to examine local notions of generalization and optimality.",3
"disunited nations? a multiplex network approach to detecting preference affinity blocs using texts and votes. this paper contributes to an emerging literature that models votes and text in tandem to better understand polarization of expressed preferences. it introduces a new approach to estimating preference polarization in multidimensional settings, such as international relations, based on developments in the natural language processing and network science literatures -- namely word embeddings, which retain valuable syntactical qualities of human language, and community detection in multilayer networks, which locates densely connected actors across multiple, complex networks. we find that employment of these tools in tandem helps to better estimate states' foreign policy preferences as expressed in un votes and speeches, beyond what is permitted by votes alone. the utility of the located affinity blocs is demonstrated through an application to conflict onset in international relations, though these tools are of interest to any scholars faced with the measurement of preferences and polarization in multidimensional settings.",3
"a direct approach to multi-class boosting and extensions. boosting methods combine a set of moderately accurate weak learners to form a highly accurate predictor. despite the practical importance of multi-class boosting, it has received far less attention than its binary counterpart. in this work, we propose a fully-corrective multi-class boosting formulation which directly solves the multi-class problem without dividing it into multiple binary classification problems. in contrast, previous multi-class boosting algorithms decompose the multi-boost problem into multiple binary boosting problems. by explicitly deriving the lagrange dual of the primal optimization problem, we are able to construct a column generation-based fully-corrective approach to boosting which directly optimizes multi-class classification performance.
the new approach updates all weak learners' coefficients at every iteration, in a manner flexible enough to accommodate various loss functions and regularizations. for example, it enables us to introduce structural sparsity through mixed-norm regularization to promote group sparsity and feature sharing. boosting with shared features is particularly beneficial in complex prediction problems where features are expensive to compute. experiments on various data sets demonstrate that direct multi-class boosting generalizes as well as, or better than, a range of competing multi-class boosting methods. the end result is a highly effective and compact ensemble classifier which can be trained in a distributed fashion.",3
"graph-based compression of dynamic 3d point cloud sequences. this paper addresses the problem of compression of 3d point cloud sequences that are characterized by moving 3d positions and color attributes. as temporally successive point cloud frames are similar, motion estimation is key to effective compression of these sequences. it however remains a challenging problem, as the point cloud frames have varying numbers of points without explicit correspondence information. we represent the time-varying geometry of these sequences with a set of graphs, and consider the 3d positions and color attributes of the point clouds as signals on the vertices of the graphs. we then cast motion estimation as a feature matching problem between successive graphs. the motion is estimated on a sparse set of representative vertices using new spectral graph wavelet descriptors. a dense motion field is eventually interpolated by solving a graph-based regularization problem. the estimated motion is finally used for removing the temporal redundancy in the predictive coding of the 3d positions and the color characteristics of the point cloud sequences. experimental results demonstrate that our method is able to accurately estimate the motion between consecutive frames. moreover, motion estimation is shown to bring significant improvement in terms of the overall compression performance of the sequence.
to the best of our knowledge, this is the first paper that exploits both the spatial correlation inside each frame (through the graph) and the temporal correlation between the frames (through the motion estimation) to compress the color and the geometry of 3d point cloud sequences in an efficient way.",3
"towards end-to-end learning for dialog state tracking and management using deep reinforcement learning. this paper presents an end-to-end framework for task-oriented dialog systems using a variant of deep recurrent q-networks (drqn). the model is able to interface with a relational database and jointly learn policies for both language understanding and dialog strategy. moreover, we propose a hybrid algorithm that combines the strengths of reinforcement learning and supervised learning to achieve a faster learning speed. we evaluated the proposed model on a 20 question game conversational game simulator. results show that the proposed method outperforms a modular-based baseline and learns a distributed representation of the latent dialog state.",3
"an information-theoretic optimality principle for deep reinforcement learning. we methodologically address the problem of q-value overestimation in deep reinforcement learning to handle high-dimensional state spaces efficiently. by adapting concepts from information theory, we introduce an intrinsic penalty signal encouraging reduced q-value estimates. the resultant algorithm encompasses a wide range of learning outcomes, containing deep q-networks as a special case. different learning outcomes can be demonstrated by tuning a lagrange multiplier accordingly. we furthermore propose a novel scheduling scheme for this lagrange multiplier to ensure efficient and robust learning. in experiments on atari games, our algorithm outperforms other algorithms (e.g. deep and double deep q-networks) in terms of both game-play performance and sample complexity.",3
"taking primitive optimality theory beyond the finite state. primitive optimality theory (otp) (eisner, 1997a; albro, 1998), a computational model of optimality theory (prince and smolensky, 1993), employs a finite state machine to represent the set of active candidates at each stage of an optimality theoretic derivation, as well as weighted finite state machines to represent the constraints themselves.
for some purposes, however, it would be convenient if the set of candidates were limited by some set of criteria capable of being described only in a higher-level grammar formalism, such as a context free grammar, a context sensitive grammar, or a multiple context free grammar (seki et al., 1991). examples include reduplication and phrasal stress models. we introduce a mechanism for otp-like optimality theory in which the constraints remain weighted finite state machines, but the sets of candidates are represented by higher-level grammars. in particular, we use multiple context-free grammars to model reduplication in the manner of correspondence theory (mccarthy and prince, 1995), and develop an extended version of the earley algorithm (earley, 1970) to apply the constraints to a reduplicating candidate set.",3
"cahn--hilliard inpainting with the double obstacle potential. the inpainting of damaged images has a wide range of applications, and many different mathematical methods have been proposed to solve this problem. inpainting with the help of cahn--hilliard models has been particularly successful, and it turns out that cahn--hilliard inpainting with the double obstacle potential can lead to better results compared to inpainting with a smooth double well potential. however, a mathematical analysis of this approach has been missing so far. in this paper we give first analytical results for the cahn--hilliard double obstacle model, and in particular we show existence of stationary solutions without constraints on the parameters involved. with the help of numerical results we show the effectiveness of the approach for binary and grayscale images.",10
"a characterization of the combined effects of overlap and imbalance on the svm classifier. in this paper we demonstrate that two common problems in machine learning---imbalanced and overlapping data distributions---do not have independent effects on the performance of svm classifiers. this result is notable since it shows that a model of either factor must account for the presence of the other. our study of the relationship between these problems has led to the discovery of a previously unreported form of ""covert"" overfitting which is resilient to commonly used empirical regularization techniques. we demonstrate the existence of this covert phenomenon through several methods based around the parametric regularization of trained svms.
our findings in this area suggest a possible approach to quantifying overlap in real world data sets.",3
"cooperative multi-agent reinforcement learning for low-level wireless communication. traditional radio systems are strictly co-designed on the lower levels of the osi stack for compatibility and efficiency. although this has enabled the success of radio communications, it has also introduced lengthy standardization processes and imposed a static allocation of the radio spectrum. various initiatives have been undertaken by the research community to tackle the problem of artificial spectrum scarcity, both by making frequency allocation more dynamic and by building flexible radios to replace the static ones. there is reason to believe that, just as computer vision and control have been overhauled by the introduction of machine learning, wireless communication can also be improved by utilizing similar techniques to increase the flexibility of wireless networks. in this work, we pose the problem of discovering low-level wireless communication schemes ex-nihilo between two agents in a fully decentralized fashion as a reinforcement learning problem. our proposed approach uses policy gradients to learn an optimal bi-directional communication scheme and shows surprisingly sophisticated and intelligent learning behavior. we present the results of extensive experiments and an analysis of the fidelity of our approach.",5
noisy expectation-maximization: applications and generalizations. we present a noise-injected version of the expectation-maximization (em) algorithm: the noisy expectation maximization (nem) algorithm. the nem algorithm uses noise to speed up the convergence of the em algorithm. the nem theorem shows that injected noise speeds up the average convergence of the em algorithm to a local maximum of the likelihood surface if a positivity condition holds. we generalize the form of the noisy expectation-maximization (nem) algorithm to allow arbitrary modes of noise injection, including adding and multiplying noise to the data. we demonstrate these noise benefits on em algorithms for the gaussian mixture model (gmm) with both additive and multiplicative nem noise injection. a separate theorem (not presented here) shows that the noise benefit for independent identically distributed additive noise decreases with sample size in mixture models. this theorem implies that the noise benefit is most pronounced if the data is sparse.
injecting blind noise only slowed convergence.,18
"bayesian nonparametric feature and policy learning for decision-making. learning from demonstrations has gained increasing interest in the recent past, enabling an agent to learn how to make decisions by observing an experienced teacher. while many approaches have been proposed to solve this problem, there is only little work that focuses on reasoning about the observed behavior. we assume that, in many practical problems, an agent makes its decision based on latent features, indicating a certain action. therefore, we propose a generative model for the states and actions. inference reveals the number of features, the features, and the policies, allowing us to learn and analyze the underlying structure of the observed behavior. further, our approach enables prediction of actions for new states. simulations are used to assess the performance of the algorithm based upon this model. moreover, the problem of learning a driver's behavior is investigated, demonstrating the performance of the proposed model in a real-world scenario.",3
"trainable referring expression generation using overspecification preferences. referring expression generation (reg) models that use speaker-dependent information require a considerable amount of training data produced by every individual speaker, or may otherwise perform poorly. in this work we present a simple reg experiment that allows the use of larger training data sets by grouping speakers according to their overspecification preferences. intrinsic evaluation shows that this method generally outperforms the personalised method found in previous work.",3
"a fast and scalable joint estimator for learning multiple related sparse gaussian graphical models. estimating multiple sparse gaussian graphical models (sggms) jointly for many related tasks (large $k$) under a high-dimensional (large $p$) situation is an important task. most previous studies for the joint estimation of multiple sggms rely on penalized log-likelihood estimators that involve expensive and difficult non-smooth optimizations. we propose a novel approach, fasjem, for \underline{fa}st and \underline{s}calable \underline{j}oint structure-\underline{e}stimation of \underline{m}ultiple sggms at a large scale.
as the first study of joint sggm using the elementary estimator framework, our work has three major contributions: (1) we solve fasjem in an entry-wise manner which is parallelizable. (2) we choose a proximal algorithm to optimize fasjem. this improves the computational efficiency from $o(kp^3)$ to $o(kp^2)$ and reduces the memory requirement from $o(kp^2)$ to $o(k)$. (3) we theoretically prove that fasjem achieves a consistent estimation with a convergence rate of $o(\log(kp)/n_{tot})$. on several synthetic and four real-world datasets, fasjem shows significant improvements over baselines on accuracy, computational complexity, and memory costs.",18
"invertible orientation scores of 3d images. the enhancement and detection of elongated structures in noisy image data is relevant for many biomedical applications. to handle complex crossing structures in 2d images, 2d orientation scores were introduced, which already showed their use in a variety of applications. here we extend this work to 3d orientation scores. first, we construct the orientation score from a given dataset, which is achieved by an invertible coherent state type of transform. for this transformation we introduce 3d versions of the 2d cake-wavelets, which are complex wavelets that can simultaneously detect oriented structures and oriented edges. for an efficient implementation of the different steps in the wavelet creation we use a spherical harmonic transform. finally, we show first results of practical applications of 3d orientation scores.",10
"the learnability of pauli noise. recently, several noise benchmarking algorithms have been developed to characterize noisy quantum gates on today's quantum devices. a well-known issue in benchmarking is that not everything about quantum noise is learnable due to the existence of gauge freedom, leaving open the question of what information about noise is learnable and what is not, which has been unclear even for a single cnot gate. here we give a precise characterization of the learnability of pauli noise channels attached to clifford gates, showing that learnable information corresponds to the cycle space of the pattern transfer graph of the gate set, while unlearnable information corresponds to the cut space. this implies the optimality of cycle benchmarking, in the sense that it can learn all learnable information about pauli noise.
we experimentally demonstrate noise characterization of ibm's cnot gate up to 2 unlearnable degrees of freedom, for which we obtain bounds using physical constraints. in addition, we give an attempt to characterize the unlearnable information by assuming perfect initial state preparation. however, based on the experimental data, we conclude that this assumption is inaccurate, as it yields unphysical estimates, and we instead obtain a lower bound on state preparation noise.",17
"fast robust methods for singular state-space models. state-space models are used in a wide range of time series analysis formulations. kalman filtering and smoothing are the work-horse algorithms in these settings. while classic algorithms assume gaussian errors to simplify estimation, recent advances use a broader range of optimization formulations to allow outlier-robust estimation, as well as constraints to capture prior information. here we develop methods for state-space models where either the innovations or the error covariances may be singular. such models frequently arise in navigation (e.g. for `colored noise' models or deterministic integrals) and are ubiquitous in auto-correlated time series models such as arma. we reformulate all state-space models (singular as well as nonsingular) as constrained convex optimization problems, and develop an efficient algorithm for this reformulation. the convergence rate is {\it locally linear}, with constants that do not depend on the conditioning of the problem. numerical comparisons show that the new approach outperforms competing approaches for {\it nonsingular} models, including state of the art interior point (ip) methods. ip methods converge at superlinear rates; one would expect them to dominate. however, the steep rate of the proposed approach (independent of problem conditioning) combined with its cheap iterations wins against ip in a run-time comparison. we therefore suggest the proposed approach as the {\it default choice} for estimating state space models outside of the gaussian context, regardless of whether the error covariances are singular or not.",10
"material recognition cnns and hierarchical planning for biped robot locomotion on slippery terrain. in this paper we tackle the problem of visually predicting surface friction for environments with diverse surfaces, and integrating this knowledge into biped robot locomotion planning.
this problem is essential for autonomous robot locomotion, since diverse surfaces with varying friction abound in the real world, from wood to ceramic tiles, grass or ice, which may cause difficulties or huge energy costs for robot locomotion if not considered. we propose to estimate friction and its uncertainty from visual estimation of material classes using convolutional neural networks, together with probability distribution functions of friction associated with each material. we then robustly integrate the friction predictions into a hierarchical (footstep and full-body) planning method using chance constraints, and optimize the same trajectory costs at both levels of the planning method for consistency. our solution achieves fully autonomous perception and locomotion on slippery terrain, and considers not only friction and its uncertainty, but also collision, stability and trajectory cost. we show promising friction prediction results on real pictures of outdoor scenarios, and planning experiments on a real robot facing surfaces with different friction.",3
"computing functions of random variables via reproducing kernel hilbert space representations. we describe a method to perform functional operations on probability distributions of random variables. the method uses reproducing kernel hilbert space representations of probability distributions, and it is applicable to all operations which can be applied to points drawn from the respective distributions. we refer to our approach as {\em kernel probabilistic programming}. we illustrate it on synthetic data, and show how it can be used for nonparametric structural equation models, with an application to causal inference.",18
"an implementation of back-propagation learning on gf11, a large simd parallel computer. current connectionist simulations require huge computational resources. we describe a neural network simulator for the ibm gf11, an experimental simd machine with 566 processors and a peak arithmetic performance of 11 gigaflops. we present our parallel implementation of the backpropagation learning algorithm, techniques for increasing efficiency, performance measurements on the nettalk text-to-speech benchmark, and a performance model for the simulator. our simulator currently runs the back-propagation learning algorithm at 900 million connections per second, where each ""connection per second"" includes both a forward and a backward pass.
this figure was obtained on the machine with only 356 processors working; with all 566 processors operational, our simulation will run at over one billion connections per second. we conclude that the gf11 is well-suited to neural network simulation, and we analyze our use of the machine to determine which features are the most important for high performance.",3
"lamarckism and mechanism synthesis: approaching constrained optimization with ideas from biology. nonlinear constrained optimization problems are encountered in many scientific fields. to utilize the huge calculation power of current computers, many mathematical models are also rebuilt as optimization problems, in which the constrained conditions need to be handled. borrowing some biological concepts, this study is accomplished by dealing with the constraints in the synthesis of a four-bar mechanism. by biologically regarding the constrained condition as a form of selection by the characteristics of the population, four new algorithms are proposed, and a new explanation is given for the penalty method. using these algorithms, three cases are tested by differential-evolution based programs. the better, or comparable, results show that the presented algorithms and methodology may become a common means of constraint handling in optimization problems.",10
paddlespeech: an easy-to-use all-in-one speech toolkit. paddlespeech is an open-source all-in-one speech toolkit. it aims at facilitating the development and research of speech processing technologies by providing an easy-to-use command-line interface and a simple code structure. this paper describes the design philosophy and core architecture of paddlespeech to support several essential speech-to-text and text-to-speech tasks. paddlespeech achieves competitive or state-of-the-art performance on various speech datasets and implements the most popular methods. it also provides recipes and pretrained models to quickly reproduce the experimental results in this paper. paddlespeech is publicly available at https://github.com/paddlepaddle/paddlespeech.,5
"revisiting stochastic off-policy action-value gradients. off-policy stochastic actor-critic methods rely on approximating the stochastic policy gradient in order to derive an optimal policy. one may also derive the optimal policy by approximating the action-value gradient.
the use of action-value gradients is desirable, as policy improvement occurs along the direction of steepest ascent. this has been studied extensively within the context of natural gradient actor-critic algorithms and more recently within the context of deterministic policy gradients. in this paper we briefly discuss the off-policy stochastic counterpart to deterministic action-value gradients, as well as an incremental approach for following the policy gradient in lieu of the natural gradient.",18
"fault-tolerant, but paradoxical path-finding in physical and conceptual systems. we report our initial investigations into reliability in path-finding based models and propose future areas of interest. inspired by broken sidewalks during on-campus construction projects, we develop two models for navigating an ""unreliable network."" both are based on the concept of ""accumulating risk"" backward from the destination, and both operate on directed acyclic graphs with a probability of failure associated with each edge. the first serves to introduce the faults addressed by the second, more conservative model. next, we show a paradox in both models that can be used to construct polynomials for conceptual networks, such as design processes or software development life cycles. if risk in a network increases uniformly, the most reliable path changes from wider and longer to shorter and narrower. if we let professional inexperience--such as that of entry level cooks or software developers--represent the probability of edge failure, does this change in path imply that a novice should follow instructions with fewer ""back-up"" plans, even though alternative routes could be followed by an expert?",3
"the biodynamo project: a platform for computer simulations of biological dynamics. this paper is a brief update on developments in the biodynamo project, a new platform for computer simulations in biological research. we discuss the new capabilities of the simulator, important new concepts in the simulation methodology, as well as numerous applications to the computational biology and nanoscience communities.",3
"pathological oct retinal layer segmentation using branch residual u-shape networks. the automatic segmentation of retinal layer structures enables clinically-relevant quantification and monitoring of eye disorders over time in oct imaging. eyes with late-stage diseases are particularly challenging to segment, as their shape is highly warped due to pathological biomarkers.
in this context, we propose a novel fully convolutional neural network (cnn) architecture which combines dilated residual blocks in an asymmetric u-shape configuration, and can segment multiple layers of highly pathological eyes in one shot. we validate our approach on a dataset of late-stage amd patients and demonstrate lower computational costs and higher performance compared to other state-of-the-art methods.",3
"calibration with changing checking rules and its application to short-term trading. we provide a natural learning process in which a financial trader without risk receives a gain in the case where the stock market is inefficient. in this process, the trader rationally chooses his gambles using predictions made by a randomized calibrated algorithm. our strategy is based on dawid's notion of calibration with more general changing checking rules and on a modification of kakade and foster's randomized algorithm for computing calibrated forecasts.",3
"autonomous agents modelling other agents: a comprehensive survey and open problems. much research in artificial intelligence is concerned with the development of autonomous agents that can interact effectively with other agents. an important aspect of such agents is the ability to reason about the behaviours of other agents, by constructing models which make predictions about various properties of interest (such as actions, goals, beliefs) of the modelled agents. a variety of modelling approaches exist which vary widely in their methodology and underlying assumptions, catering to the needs of the different sub-communities within which they were developed and reflecting the different practical uses for which they are intended. the purpose of the present article is to provide a comprehensive survey of the salient modelling methods which can be found in the literature. the article concludes with a discussion of open problems which may form the basis for fruitful future research.",3
"graph quantization. vector quantization (vq) is a lossy data compression technique from signal processing, which is restricted to feature vectors and therefore inapplicable to combinatorial structures. this contribution presents a theoretical foundation of graph quantization (gq) that extends vq to the domain of attributed graphs. we present the necessary lloyd-max conditions for optimality of a graph quantizer and consistency results for optimal gq design based on empirical distortion measures and stochastic optimization.
these results statistically justify existing clustering algorithms in the domain of graphs. the proposed approach provides a template of how to link structural pattern recognition methods other than gq to statistical pattern recognition.",3
"deep visual attention prediction. in this work, we aim to predict human eye fixation in view-free scenes based on an end-to-end deep learning architecture. although convolutional neural networks (cnns) have made substantial improvement on human attention prediction, it is still needed to improve cnn based attention models by efficiently leveraging multi-scale features. our visual attention network is proposed to capture hierarchical saliency information, from deep, coarse layers with global saliency information to shallow, fine layers with local saliency response. our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various reception fields. the final saliency prediction is achieved via the cooperation of these global and local predictions. our model is learned in a deep supervision manner, where supervision is directly fed into multi-level layers, instead of previous approaches of providing supervision only at the output layer and propagating this supervision back to earlier layers. our model thus incorporates multi-level saliency predictions within a single network, which significantly decreases the redundancy of previous approaches of learning multiple network streams with different input scales. extensive experimental analysis on various challenging benchmark datasets demonstrates that our method yields state-of-the-art performance with competitive inference time.",3
"filamentary structures of ionized gas in cygnus x. ionized gas probes the influence of massive stars on their environment. the cygnus x region (d~1.5 kpc) is one of the most massive star forming complexes in our galaxy, in which the cyg ob2 association (age of 3-5 myr and stellar mass $2 \times 10^{4}$ m$_{\odot}$) has a dominant influence. we observe the cygnus x region at 148 mhz using the low frequency array (lofar) and take into account short-spacing information during image deconvolution.
together with data from the canadian galactic plane survey, we investigate the morphology, distribution, and physical conditions of low-density ionized gas in a $4^{\circ} \times 4^{\circ}$ (100 pc $\times$ 100 pc) region at a resolution of 2' (0.9 pc). the galactic radio emission in this region is analyzed to be almost entirely thermal (free-free) at 148 mhz, with emission measures of $10^3 < em~{\rm[pc~cm^{-6}]} < 10^6$. filamentary structure is a prominent feature of the emission, and we use disperse and filchap to identify filamentary ridges and characterize their radial ($em$) profiles. the distribution of radial profiles has a characteristic width of 4.3 pc and a power-law distribution ($\beta = -1.8 \pm 0.1$) in peak $em$ down to our completeness limit of 4200 pc cm$^{-6}$. electron densities of the filamentary structure range over $10 < n_e~{\rm[cm^{-3}]} < 400$ with a median value of 35 cm$^{-3}$, remarkably similar to [n ii] surveys of ionized gas. cyg ob2 may ionize up to two-thirds of the total ionized gas and of the ionized gas in filaments. half of the filamentary structures are likely photoevaporating surfaces flowing into the surrounding diffuse (~5 cm$^{-3}$) medium. however, this is likely not the case for all ionized gas ridges. the characteristic width in the distribution of ionized gas points to the stellar winds of cyg ob2 creating a fraction of the ionized filaments through swept-up ionized gas or dissipated turbulence.",0
"nonconvex penalties with analytical solutions for one-bit compressive sensing. one-bit measurements widely exist in the real world and can be used to recover sparse signals. this task is known as the problem of learning halfspaces in learning theory and as one-bit compressive sensing (1bit-cs) in signal processing. in this paper, we propose novel algorithms based on both convex and nonconvex sparsity-inducing penalties for robust 1bit-cs. we provide a sufficient condition to verify whether a solution is globally optimal or not. then we show that the globally optimal solution for positive homogeneous penalties can be obtained in two steps: a proximal operator and a normalization step. for several nonconvex penalties, including the minimax concave penalty (mcp), the $\ell_0$ norm, and the sorted $\ell_1$ penalty, we provide fast algorithms for finding the analytical solutions by solving the dual problem. specifically, our algorithm is more than $200$ times faster than the existing algorithm for mcp.
its efficiency is comparable to the algorithm for the $\ell_1$ penalty in time, while its performance is much better. among these penalties, the sorted $\ell_1$ penalty is most robust to noise in different settings.",3
"an entropy-based closure for probabilistic learning on manifolds. in a recent paper, the authors proposed a general methodology for probabilistic learning on manifolds. the method was used to generate numerical samples that are statistically consistent with an existing dataset construed as a realization of a non-gaussian random vector. the manifold structure is learned using diffusion manifolds, and the statistical sample generation is accomplished using a projected ito stochastic differential equation. this probabilistic learning approach has been extended to polynomial chaos representation of databases on manifolds and to probabilistic nonconvex constrained optimization with a fixed budget of function evaluations. the methodology introduces an isotropic-diffusion kernel with hyperparameter {\epsilon}. currently, {\epsilon} is more or less arbitrarily chosen. in this paper, we propose a selection criterion for identifying the optimal value of {\epsilon}, based on a maximum entropy argument. the result is a comprehensive, closed, probabilistic model for characterizing data sets with hidden constraints. this entropy argument ensures that, out of all possible models, the one that is most uncertain beyond the specified constraints is selected. applications are presented for several databases.",10
"threshold solutions for the intercritical inhomogeneous nls. we consider the focusing inhomogeneous nonlinear schr\""odinger equation in $h^1(\mathbb{r}^3)$, \begin{equation} i\partial_t u + \Delta u + |x|^{-b}|u|^{2}u=0,\end{equation} where $0 < b <\tfrac{1}{2}$. previous works established a blowup/scattering dichotomy below the mass-energy threshold determined by the ground state solution $q$. in this work, we study solutions exactly at the mass-energy threshold. in addition to the ground state solution, we prove the existence of solutions $q^\pm$, which approach the standing wave in the positive time direction, but either blow up or scatter in the negative time direction. using these particular solutions, we classify all possible behaviors of threshold solutions.
in particular, any such solution either behaves as in the sub-threshold case, or agrees with $e^{it}q$, $q^+$, or $q^-$ up to the symmetries of the equation.",10
"online multiple kernel learning for structured prediction. despite the recent progress towards efficient multiple kernel learning (mkl), the structured output case remains an open research front. current approaches involve repeatedly solving a batch learning problem, which makes them inadequate for large scale scenarios. we propose a new family of online proximal algorithms for mkl (as well as for group-lasso and variants thereof), which overcomes that drawback. we show regret, convergence, and generalization bounds for the proposed method. experiments on handwriting recognition and dependency parsing testify to the successfulness of the approach.",18
"the effects of data size and frequency range on distributional semantic models. this paper investigates the effects of data size and frequency range on distributional semantic models. we compare the performance of a number of representative models for several test settings over data of varying sizes, and over test items of various frequency. our results show that neural network-based models underperform when the data is small, and that the most reliable model over data of varying sizes and frequency ranges is the inverted factorized model.",3
"a preadapted universal switch distribution for testing hilberg's conjecture. hilberg's conjecture about natural language states that the mutual information of two adjacent long blocks of text grows like a power of the block length. the exponent in this statement can be upper bounded using the pointwise mutual information estimate computed for a carefully chosen code. the bound is the better, the lower the compression rate is, but there is a requirement that the code be universal. to improve over the received upper bound for hilberg's exponent, in this paper, we introduce two novel universal codes, called the plain switch distribution and the preadapted switch distribution. generally speaking, switch distributions are certain mixtures of adaptive markov chains of varying orders with some additional communication to avoid the so-called catch-up phenomenon. the advantage of these distributions is that they achieve a low compression rate and are guaranteed to be universal.
Using these switch distributions, we obtain for a sample of text in English, which is non-Markovian, that Hilberg's exponent is $\le 0.83$, which improves over the previous bound $\le 0.94$ obtained using the Lempel-Ziv code.",3 "Machine learning meets network science: dimensionality reduction for fast and efficient embedding of networks in the hyperbolic space. Complex network topologies and hyperbolic geometry seem specularly connected, and one of the most fascinating and challenging problems of recent complex network theory is to map a given network to its hyperbolic space. The popularity similarity optimization (PSO) model represents - at the moment - the climax of this theory. It suggests that the trade-off between node popularity and similarity is a mechanism to explain how complex network topologies emerge - as discrete samples - from the continuous world of hyperbolic geometry. The hyperbolic space seems appropriate to represent real complex networks. In fact, it preserves many of their fundamental topological properties, and can be exploited in real applications such as, among others, link prediction and community detection. Here, we observe for the first time that a topological-based machine learning class of algorithms - for nonlinear unsupervised dimensionality reduction - can directly approximate the network's node angular coordinates of the hyperbolic model into a two-dimensional space, according to a similar topological organization that we named angular coalescence. On the basis of this phenomenon, we propose a new class of algorithms that offers fast and accurate coalescent embedding of networks in the hyperbolic space, even for graphs with thousands of nodes.",2 "The price of anarchy in auctions. This survey outlines a general and modular theory for proving approximation guarantees for equilibria of auctions in complex settings. This theory complements traditional economic techniques, which generally focus on exact and optimal solutions and are accordingly limited to relatively stylized settings.
We highlight three user-friendly analytical tools: smoothness-type inequalities, which immediately yield approximation guarantees for many auction formats of interest in the special case of complete information and deterministic strategies; extension theorems, which extend such guarantees to randomized strategies, no-regret learning outcomes, and incomplete-information settings; and composition theorems, which extend such guarantees from simpler to more complex auctions. Combining these tools yields tight worst-case approximation guarantees for the equilibria of many widely-used auction formats.",3 "Hybrid gradient boosting trees and neural networks for forecasting operating room data. Time series data constitutes a distinct and growing problem in machine learning. As the corpus of time series data grows larger, deep models that simultaneously learn features and classify with these features can be intractable or suboptimal. In this paper, we present feature learning via long short term memory (LSTM) networks and prediction via gradient boosting trees (XGB). Focusing on the consequential setting of electronic health record data, we predict the occurrence of hypoxemia five minutes into the future based on past features. We make two observations: 1) long short term memory networks are effective at capturing long term dependencies based on a single feature, and 2) gradient boosting trees are capable of tractably combining a large number of features, including static features like height and weight. With these observations in mind, we generate features by performing ""supervised"" representation learning with LSTM networks. Augmenting the original XGB model with these features gives significantly better performance than either individual method.",3 "A combinatorial solution to non-rigid 3D shape-to-image matching. We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles, our matching compromises between image and shape information. In this paper, we resolve two major challenges: firstly, we address the resulting large and NP-hard combinatorial problem with a suitable graph-theoretic approach.
Secondly, we propose an efficient discretisation of the unbounded 6-dimensional Lie group SE(3). To our knowledge this is the first combinatorial formulation for non-rigid 3D shape-to-image matching. In contrast to existing local (gradient descent) optimisation methods, we obtain solutions that do not require a good initialisation and that are within a bound of the optimal solution. We evaluate the proposed method on the two problems of non-rigid 3D shape-to-shape and non-rigid 3D shape-to-image registration, and demonstrate that it provides promising results.",3 "Semantic social network analysis. Social network analysis (SNA) tries to understand and exploit the key features of social networks in order to manage their life cycle and predict their evolution. Increasingly popular web 2.0 sites are forming a huge social network. Classical methods of social network analysis (SNA) can be applied to such online networks. In this paper, we propose leveraging semantic web technologies to merge and exploit the best features of each domain. We present how to facilitate and enhance the analysis of online social networks, exploiting the power of semantic social network analysis.",3 "Fully-charm and -bottom pentaquarks in a lattice-QCD inspired quark model. Fully-charm and -bottom pentaquarks, \emph{i.e.} $cccc\bar{c}$ and $bbbb\bar{b}$, with spin-parity quantum numbers $J^P=\frac{1}{2}^-$, $\frac{3}{2}^-$ and $\frac{5}{2}^-$, are investigated within a lattice-QCD inspired quark model, which has already successfully described the recently announced fully-charm tetraquark candidate $X(6900)$, and also predicted several fully-heavy tetraquarks. A powerful computational technique, based on the Gaussian expansion method combined with the complex-scaling range approach, is employed to predict, and distinguish among, bound, resonance and scattering states of the mentioned five-body systems. The baryon-meson and diquark-diquark-antiquark configurations, along with all possible color channels, are comprehensively considered. Narrow resonances are obtained in each spin-parity channel for both fully-charm and -bottom systems.
Moreover, they seem to be compact multiquarks whose wave-functions are dominated by either the hidden-color baryon-meson or the diquark-diquark-antiquark structure, or a coupling of them.",8 "Efficient VVC intra prediction based on deep feature fusion and probability estimation. The ever-growing multimedia traffic has underscored the importance of effective multimedia codecs. Among them, the up-to-date lossy video coding standard, versatile video coding (VVC), has been attracting attention in the video coding community. However, the gain of VVC is achieved at the cost of significant encoding complexity, which brings the need to realize a fast encoder with comparable rate distortion (RD) performance. In this paper, we propose to optimize the VVC complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation. In the first stage, we employ a deep convolutional network to extract the spatial-temporal neighboring coding features. Then we fuse all the reference features obtained by different convolutional kernels to determine the optimal intra coding depth. In the second stage, we employ a probability-based model of spatial-temporal coherence to select the candidate partition modes within the optimal coding depth. Finally, only the selected depths and partitions are executed whilst unnecessary computations are excluded. Experimental results on the standard database demonstrate the superiority of the proposed method, especially for high definition (HD) and ultra-HD (UHD) video sequences.",5 "Learnable frequency filters for speech feature extraction in speaker verification. Mel-scale spectrum features are used in various recognition and classification tasks on speech signals. There is no reason to expect that these features are optimal for all different tasks, including speaker verification (SV). This paper describes a learnable front-end model for feature extraction. The model comprises a group of filters that transform the Fourier spectrum. The model parameters that define these filters are trained end-to-end and optimized specifically for the task of speaker verification. Compared to the standard Mel-scale filter-bank, the filters' bandwidths and center frequencies are adjustable.
Experimental results show that applying the learnable acoustic front-end improves speaker verification performance over conventional Mel-scale spectrum features. Analysis of the learned filter parameters suggests that narrow-band information benefits the SV system performance. The proposed model achieves a good balance between performance and computation cost. In resource-constrained computation settings, the model significantly outperforms CNN-based learnable front-ends. The generalization ability of the proposed model is also demonstrated on different embedding extraction models and datasets.",5 "The FastMap algorithm for shortest path computations. We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. The Euclidean distance between any two nodes in this space approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed with an A* search using the Euclidean distances as heuristic. Our preprocessing algorithm, called FastMap, is inspired by the data mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using semidefinite programming. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths. Moreover, FastMap applies to general undirected graphs for which many traditional heuristics, such as the Manhattan distance heuristic, are not well defined. Empirically, we demonstrate that A* search using the FastMap heuristic is competitive with A* search using other state-of-the-art heuristics, such as the differential heuristic.",3 "Evolutionary artificial neural networks based on chemical reaction optimization. Evolutionary algorithms (EAs) are very popular tools to design and evolve artificial neural networks (ANNs), and especially to train them. These methods have advantages over the conventional backpropagation (BP) method, such as a low computational requirement when searching in a large solution space. In this paper, we employ chemical reaction optimization (CRO), a newly developed global optimization method, to replace BP in training neural networks. CRO is a population-based metaheuristic mimicking the transition of molecules and their interactions in a chemical reaction.
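The FastMap entry above embeds graph nodes so that distances in the embedding approximate shortest-path lengths, which then serve as an A* heuristic. The following is a minimal sketch of that idea, not the paper's exact algorithm: pivot pairs are found by a few farthest-point sweeps, and, as an assumption here, coordinates are combined under the L1 norm so the heuristic stays a lower bound on the residual graph distance.

```python
import heapq

def dijkstra(graph, src):
    # Single-source shortest-path distances on a weighted undirected graph,
    # given as {node: [(neighbor, weight), ...]}.
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def fastmap_embed(graph, nodes, k=3, eps=1e-4):
    # FastMap-style embedding: each dimension comes from a far-apart pivot
    # pair (a, b); the coordinate of x is (d(a,x) + d(a,b) - d(b,x)) / 2,
    # where d is the graph distance minus what earlier coordinates explain.
    coords = {v: [] for v in nodes}

    def residual(dist_u, u, v):
        d = dist_u[v]
        for cu, cv in zip(coords[u], coords[v]):
            d -= abs(cu - cv)          # L1 residual (an assumption here)
        return max(d, 0.0)

    for _ in range(k):
        a = nodes[0]
        for _ in range(5):             # heuristic farthest-pair sweeps
            da = dijkstra(graph, a)
            b = max(nodes, key=lambda v: residual(da, a, v))
            db = dijkstra(graph, b)
            a2 = max(nodes, key=lambda v: residual(db, b, v))
            if a2 == a:
                break
            a = a2
        da, db = dijkstra(graph, a), dijkstra(graph, b)
        dab = residual(da, a, b)
        if dab < eps:                  # nothing left to explain
            break
        new = {v: (residual(da, a, v) + dab - residual(db, b, v)) / 2
               for v in nodes}
        for v in nodes:
            coords[v].append(new[v])
    return coords
```

At runtime, the L1 distance between stored coordinates would serve as the A* heuristic; on a simple path graph a single dimension already reproduces the exact shortest-path distances.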
Simulation results show that CRO outperforms many EA strategies commonly used to train neural networks.",3 "The impact of gravitational lensing on black hole mass function inference with third-generation gravitational wave detectors. The recent rapid growth of the black hole (BH) catalog from gravitational waves (GWs) has allowed us to study the substructure of the black hole mass function (BHMF) beyond the simplest power-law distribution. However, the BH masses inferred from binary BH merger events may be systematically 'brightened' or 'dimmed' by the gravitational lensing effect. In this work, we investigate the impact of gravitational lensing on BHMF inference, considering detection with a third-generation GW detector -- the Einstein Telescope (ET). We focus on high redshift, $z=10$, in order to obtain upper limits on this effect. We use a Monte Carlo (MC) method to simulate the data, adopting 3 original BHMFs in the un-lensed and lensed scenarios, then recover the parameters of the BHMFs from the mock data and compare the differences between the results, respectively. We found that all the parameters are well recovered within one standard deviation (std., 1$\sigma$), and that the 3 BHMF models are reconstructed within the 68\% credible interval, suggesting that lensing would not change the main structure drastically, even at high redshifts and with the high precision of ET. A modest influence appears beyond $50M_{\odot}$, which depends on the modeling of the high mass tail or substructure of the BHMF. We conclude that the impact of lensing on BHMF inference with ET can be safely ignored in the foreseeable future. Careful handling of lensing effects is required only when focusing on accurate estimation of the high mass end of the BHMF at high redshifts.",0 "Observation of same-sign WW production from double parton scattering in proton-proton collisions at $\sqrt{s}$ = 13 TeV. The first observation of the production of W$^\pm$W$^\pm$ bosons from double parton scattering processes, using same-sign electron-muon and dimuon events in proton-proton collisions, is reported. The data sample corresponds to an integrated luminosity of 138 fb$^{-1}$, recorded at a center-of-mass energy of 13 TeV using the CMS detector at the CERN LHC. Multivariate discriminants are used to distinguish the signal process from the main backgrounds. A binned maximum likelihood fit is performed to extract the signal cross section.
The measured cross section for the production of same-sign W bosons decaying leptonically is 80.7 $\pm$ 11.2 (stat) $^{+9.5}_{-8.6}$ (syst) $\pm$ 12.1 (model) fb, whereas the measured fiducial cross section is 6.28 $\pm$ 0.81 (stat) $\pm$ 0.69 (syst) $\pm$ 0.37 (model) fb. The observed significance of the signal is 6.2 standard deviations above the background-only hypothesis.",7 Calibration of the one-class SVM for MV set estimation. A general approach for anomaly detection or novelty detection consists in estimating high density regions or minimum volume (MV) sets. The one-class support vector machine (OCSVM) is a state-of-the-art algorithm for estimating such regions from high dimensional data. Yet it suffers from practical limitations. When applied to a limited number of samples it can lead to poor performance even when picking the best hyperparameters. Moreover, the solution of the OCSVM is sensitive to the selection of hyperparameters, which makes it hard to optimize in an unsupervised setting. We present a new approach to estimate MV sets using the OCSVM with a different choice of the parameter controlling the proportion of outliers. The solution function of the OCSVM is learnt on a training set, and the desired probability mass is obtained by adjusting the offset on a test set to prevent overfitting. Models learnt on different train/test splits are then aggregated to reduce the variance induced by such random splits. Our approach makes it possible to tune the hyperparameters automatically and to obtain nested set estimates. Experimental results show that our approach outperforms the standard OCSVM formulation while suffering less from the curse of dimensionality than kernel density estimates. Results on actual data sets are also presented.,18 "VQA-Machine: learning to use existing vision algorithms to answer new questions. One of the most intriguing features of the visual question answering (VQA) challenge is the unpredictability of the questions. Extracting the information required to answer them demands a variety of image operations, from detection and counting to segmentation and reconstruction. To train a method to perform even one of these operations accurately from {image, question, answer} tuples would be challenging, but to aim to achieve them all with a limited set of such training data seems ambitious at best.
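The key step in the OCSVM calibration entry is adjusting the decision-function offset on held-out data so that the estimated region captures a desired probability mass, then aggregating over random splits. A small illustration of just that calibration step, applied to plain anomaly scores (the scorer itself, e.g. an OCSVM decision function, is assumed given; the function names here are ours):

```python
import math
import random

def calibrate_offset(holdout_scores, alpha):
    # Pick a threshold t so that {x : score(x) >= t} captures roughly a
    # (1 - alpha) fraction of the held-out points -- the offset-adjustment
    # step done on data not used to fit the scorer.
    s = sorted(holdout_scores)
    k = min(len(s) - 1, int(math.floor(alpha * len(s))))
    return s[k]

def aggregated_threshold(scores, alpha, n_splits=5, seed=0):
    # Average the calibrated offset over random holdout subsamples to
    # reduce the variance induced by any single split.
    rng = random.Random(seed)
    ts = []
    for _ in range(n_splits):
        holdout = rng.sample(scores, max(1, len(scores) // 2))
        ts.append(calibrate_offset(holdout, alpha))
    return sum(ts) / len(ts)
```

With `alpha = 0.1`, the returned threshold keeps about 90% of the held-out mass inside the estimated set, which is the nesting property the entry exploits across different `alpha` values.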
We propose instead a general and scalable approach which exploits the fact that good methods to achieve these operations already exist, and thus do not need to be trained. Our method thus learns how to exploit a set of external off-the-shelf algorithms to achieve its goal, an approach that has something in common with the neural turing machine. The core of our proposed method is a new co-attention model. In addition, the proposed approach generates human-readable reasons for its decision, and can still be trained end-to-end without ground truth reasons being given. We demonstrate the effectiveness on two publicly available datasets, Visual Genome and VQA, and show that it produces state-of-the-art results in both cases.",3 "Majority vote of diverse classifiers for late fusion. In the past few years, a lot of attention has been devoted to multimedia indexing by fusing multimodal informations. Two kinds of fusion schemes are generally considered: early fusion and late fusion. We focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent and elegant well-founded quadratic program named MinCq coming from the machine learning PAC-Bayesian theory. MinCq looks for the weighted combination, over a set of real-valued functions seen as voters, leading to the lowest misclassification rate, while maximizing the voters' diversity. We propose an extension of MinCq tailored to multimedia indexing. Our method is based on an order-preserving pairwise loss adapted to ranking that allows us to improve the mean averaged precision measure while taking into account the diversity of the voters that we want to fuse. We provide evidence that this method is naturally adapted to late fusion procedures and confirm the good behavior of our approach on the challenging PASCAL VOC'07 benchmark.",18 "An algorithm for quasi-associative and quasi-Markovian rules of combination in information fusion. In this paper one proposes a simple algorithm for combining fusion rules, those rules which first use the conjunctive rule and then transfer the conflicting mass to non-empty sets, in such a way that they gain the property of associativity and fulfill the Markovian requirement for dynamic fusion. Also, a new rule, SDL-improved, is presented.",3 "Dimensionality-reduced subspace clustering.
Subspace clustering refers to the problem of clustering unlabeled high-dimensional data points into a union of low-dimensional linear subspaces, whose number, orientations, and dimensions are all unknown. In practice one may have access to dimensionality-reduced observations of the data only, resulting, e.g., from undersampling due to complexity and speed constraints on the acquisition device or mechanism. More pertinently, even if the high-dimensional data set is available, it is often desirable to first project the data points into a lower-dimensional space and to perform the clustering there; this reduces storage requirements and computational cost. The purpose of this paper is to quantify the impact of dimensionality reduction through random projection on the performance of three subspace clustering algorithms, all of which are based on principles from sparse signal recovery. Specifically, we analyze the thresholding based subspace clustering (TSC) algorithm, the sparse subspace clustering (SSC) algorithm, and an orthogonal matching pursuit variant thereof (SSC-OMP). We find, for all three algorithms, that dimensionality reduction down to the order of the subspace dimensions is possible without incurring significant performance degradation. Moreover, these results are order-wise optimal in the sense that reducing the dimensionality further leads to a fundamentally ill-posed clustering problem. Our findings carry over to the noisy case, as illustrated through analytical results for TSC and simulations for SSC and SSC-OMP. Extensive experiments on synthetic and real data complement our theoretical findings.",18 "A deep architecture for unified aesthetic prediction. Image aesthetics has become an important criterion for visual content curation on social media sites and media content repositories. Previous work on aesthetic prediction models in the computer vision community has focused on aesthetic score prediction or binary image labeling. However, raw aesthetic annotations in the form of score histograms provide richer and more precise information than binary labels or mean scores. Consequently, in this work we focus on the rarely-studied problem of predicting aesthetic score distributions and propose a novel architecture and training procedure for our model.
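The dimensionality reduction studied in the subspace clustering entry is a random projection. A generic Gaussian (Johnson-Lindenstrauss style) projection can be sketched as follows; this is a plain illustration of the construction, not the paper's analysis:

```python
import random

def random_projection(vectors, m, seed=0):
    # Multiply each d-dimensional point by an m x d Gaussian matrix scaled
    # by 1/sqrt(m) -- the standard Johnson-Lindenstrauss construction,
    # which approximately preserves norms and angles with high probability.
    d = len(vectors[0])
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(d)] for _ in range(m)]
    return [[sum(row[j] * v[j] for j in range(d)) for row in P]
            for v in vectors]
```

The map is linear, so relationships such as one point being a scalar multiple of another (points on the same 1-dimensional subspace) are preserved exactly, which is why projections of this kind interact well with subspace clustering.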
Our model achieves state-of-the-art results on the standard AVA large-scale benchmark dataset for three tasks: (i) aesthetic quality classification; (ii) aesthetic score regression; and (iii) aesthetic score distribution prediction, all while using one model trained only for the distribution prediction task. We also introduce a method to modify an image such that its predicted aesthetics changes, and use this modification to gain insight into our model.",3 "Nearly tight bounds on $\ell_1$ approximation of self-bounding functions. We study the complexity of learning and approximation of self-bounding functions over the uniform distribution on the Boolean hypercube ${0,1}^n$. Informally, a function $f:{0,1}^n \rightarrow \mathbb{R}$ is self-bounding if for every $x \in {0,1}^n$, $f(x)$ upper bounds the sum of all the $n$ marginal decreases in the value of the function at $x$. Self-bounding functions include such well-known classes of functions as submodular and fractionally-subadditive (XOS) functions. They were introduced by Boucheron et al. in the context of concentration of measure inequalities. Our main result is a nearly tight $\ell_1$-approximation of self-bounding functions by low-degree juntas. Specifically, all self-bounding functions can be $\epsilon$-approximated in $\ell_1$ by a polynomial of degree $\tilde{O}(1/\epsilon)$ over $2^{\tilde{O}(1/\epsilon)}$ variables. We show that both the degree and junta-size are optimal up to logarithmic terms. Previous techniques considered stronger $\ell_2$ approximation and proved nearly tight bounds of $\Theta(1/\epsilon^{2})$ on the degree and $2^{\Theta(1/\epsilon^2)}$ on the number of variables. Our bounds rely on the analysis of the noise stability of self-bounding functions together with a stronger connection between noise stability and $\ell_1$ approximation by low-degree polynomials. This technique can also be used to get tighter bounds on $\ell_1$ approximation by low-degree polynomials and a faster learning algorithm for halfspaces. These results lead to improved, and in several cases almost tight, bounds for PAC and agnostic learning of self-bounding functions relative to the uniform distribution.
In particular, assuming hardness of learning juntas, we show that PAC and agnostic learning of self-bounding functions have complexity $n^{\tilde{\Theta}(1/\epsilon)}$.",3 "Left co-Kothe and generalized co-Kothe rings: solving an old problem of Kothe for non-commutative rings. We solve the classical Kothe problem, concerning the structure of non-commutative rings with the property that every left module is a direct sum of cyclic modules. A ring $R$ is left (resp., right) Kothe if every left (resp., right) $R$-module is a direct sum of cyclic $R$-modules. Kothe [Math. Z. 39 (1934), 31-44] showed that Artinian principal ideal rings are left Kothe rings. Cohen and Kaplansky [Math. Z. 54 (1951), 97-101] proved that commutative Kothe rings are Artinian principal ideal rings. Faith [Math. Ann. 164 (1966), 207-212] characterized commutative rings whose proper factor rings are Kothe rings. However, Nakayama [Proc. Imp. Acad. Japan 16 (1940), 285-289] gave an example of a left Kothe ring which is not a principal ideal ring. Kawada [Sci. Rep. Tokyo Kyoiku Daigaku, Sect. 7 (1962), 154-230; Sect. 8 (1965), 165-250] completely solved the Kothe problem for basic finite dimensional algebras. So far, Kothe's problem is still open in the non-commutative setting. In this paper, among other related results, we solve the Kothe problem for any ring. We also determine the structure of left co-Kothe rings (rings whose left modules are direct sums of co-cyclic modules). Finally, as an application, we present several characterizations of left Kawada rings, which generalizes the well-known result of Ringel on finite dimensional Kawada algebras (a ring $R$ is called a left Kawada ring if every ring Morita equivalent to $R$ is a left Kothe ring).",10 "A multiagent approach for the representation of information in a decision support system. In an emergency situation, the actors need assistance allowing them to react swiftly and efficiently. In this prospect, we present in this paper a decision support system that aims to prepare actors for a crisis situation thanks to decision-making support. The global architecture of this system is presented in the first part. Then we focus on the part of this system designed to represent the information of the current situation. This part is composed of a multiagent system made of factual agents. Each agent carries a semantic feature and aims to represent a partial part of the situation.
These agents develop thanks to their interactions, by comparing their semantic features using proximity measures according to specific ontologies.",3 "Cortical prediction markets. We investigate cortical learning from the perspective of mechanism design. First, we show that discretizing standard models of neurons and synaptic plasticity leads to rational agents maximizing simple scoring rules. Second, our main result is that the scoring rules are proper, implying that neurons faithfully encode expected utilities in their synaptic weights and encode high-scoring outcomes in their spikes. Third, with this foundation in hand, we propose a biologically plausible mechanism whereby neurons backpropagate incentives, which allows them to optimize their usefulness to the rest of cortex. Finally, experiments show that networks that backpropagate incentives can learn simple tasks.",3 "Transductive adversarial networks (TAN). Transductive adversarial networks (TAN) is a novel domain-adaptation machine learning framework that is designed for learning a conditional probability distribution on unlabelled input data in a target domain, while also only having access to: (1) easily obtained labelled data from a related source domain, which may have a different conditional probability distribution than the target domain, and (2) a marginalised prior distribution on the labels for the target domain. TAN leverages a fully adversarial training procedure and a unique generator/encoder architecture which approximates the transductive combination of the available source- and target-domain data. A benefit of TAN is that it allows the distance between the source- and target-domain label-vector marginal probability distributions to be greater than 0 (i.e. different tasks across the source and target domains), whereas other domain-adaptation algorithms require this distance to equal 0 (i.e. a single task across the source and target domains). TAN can, however, still handle the latter case and is a more generalised approach to this case. Another benefit of TAN is that, due to its fully adversarial algorithm, it has the potential to accurately approximate highly complex distributions. Theoretical analysis demonstrates the viability of the TAN framework.",18 "RGB-D salient object detection based on discriminative cross-modal transfer learning.
In this work, we propose to utilize convolutional neural networks to boost the performance of depth-induced salient object detection by capturing high-level representative features for the depth modality. We formulate depth-induced saliency detection as a CNN-based cross-modal transfer problem, to bridge the gap between the ""data-hungry"" nature of CNNs and the unavailability of sufficient labeled training data in the depth modality. In the proposed approach, we leverage the auxiliary data from the source modality effectively by training the RGB saliency detection network, to obtain the task-specific pre-understanding layers for the target modality. Meanwhile, we exploit the depth-specific information by pre-training a modality classification network that encourages modal-specific representations in the optimizing course. Thus, it could make the feature representations of the RGB and depth modalities as discriminative as possible. These two modules are pre-trained independently and then stitched to initialize and optimize the eventual depth-induced saliency detection model. Experiments demonstrate the effectiveness of the proposed novel pre-training strategy, as well as the significant and consistent improvements of the proposed approach over other state-of-the-art methods.",3 "Unsupervised feature learning for writer identification and writer retrieval. Deep convolutional neural networks (CNN) have shown great success in supervised classification tasks such as character classification or dating. Deep learning methods typically need a lot of annotated training data, which is not available in many scenarios. In these cases, traditional methods are often better than or equivalent to deep learning methods. In this paper, we propose a simple, yet effective, way to learn CNN activation features in an unsupervised manner. Therefore, we train a deep residual network using surrogate classes. The surrogate classes are created by clustering the training dataset, where each cluster index represents one surrogate class. The activations from the penultimate CNN layer serve as features for subsequent classification tasks. We evaluate the feature representations on two publicly available datasets. The focus lies on the ICDAR17 competition dataset on historical document writer identification (Historical-WI).
We show that the activation features trained without supervision are superior to descriptors of state-of-the-art writer identification methods. Additionally, we achieve comparable results in the case of handwriting classification using the ICFHR16 competition dataset on historical Latin script types (CLaMM16).",3 "Directed reduction algorithms and decomposable graphs. In recent years, there have been intense research efforts to develop efficient methods for probabilistic inference in probabilistic influence diagrams or belief networks. Many people have concluded that the best methods are those based on undirected graph structures, and that those methods are inherently superior to those based on node reduction operations on the influence diagram. We show that these two approaches are essentially the same, since they are explicitly or implicitly building and operating on the same underlying graphical structures. In this paper we examine those graphical structures and show how this insight can lead to an improved class of directed reduction methods.",3 "Scale coding bag of deep features for human attribute and action recognition. Most approaches to human attribute and action recognition in still images are based on image representations in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal, and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and the Human Attributes dataset (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state-of-the-art.",3 "The role of commutativity in constraint propagation algorithms.
Constraint propagation algorithms form an important part of most constraint programming systems. We provide a simple, yet general framework that allows us to explain several constraint propagation algorithms in a systematic way. In this framework we proceed in two steps. First, we introduce a generic iteration algorithm on partial orderings and prove its correctness in an abstract setting. Then we instantiate this algorithm with specific partial orderings and functions to obtain specific constraint propagation algorithms. In particular, using the notions of commutativity and semi-commutativity, we show that the {\tt AC-3}, {\tt PC-2}, {\tt DAC} and {\tt DPC} algorithms for achieving (directional) arc consistency and (directional) path consistency are instances of a single generic algorithm. The work reported here extends and simplifies that of Apt \citeyear{apt99b}.",3 "A fact sheet on the semantic web. This report gives an overview of activities on the topic of the semantic web. It was released as a technical report of the project ""KTweb -- Connecting Knowledge Technologies Communities"" in 2003.",3 "An $\mathcal{O}(n\log n)$ projection operator for weighted $\ell_1$-norm regularization with a sum constraint. We provide a simple and efficient algorithm for the projection operator for weighted $\ell_1$-norm regularization subject to a sum constraint, together with an elementary proof. The implementation of the proposed algorithm can be downloaded from the author's homepage.",3 "Image denoising and restoration with a CNN-LSTM encoder decoder with direct attention. Image denoising has always been a challenging task in the field of computer vision and image processing. In this paper, we propose an encoder-decoder model with direct attention, which is capable of denoising and reconstructing highly corrupted images. Our model consists of an encoder and a decoder, where the encoder is a convolutional neural network and the decoder is a multilayer long short-term memory network. In the proposed model, the encoder reads an image and catches the abstraction of that image in a vector, then the decoder takes that vector as well as the corrupted image to reconstruct a clean image. We trained our model on the MNIST handwritten digit database, after making the lower half of every image black as well as adding noise on top of that. After such massive destruction of the images, where it is hard for a human to understand the content of those images, our model can retrieve that image with minimal error.
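The projection-operator entry above solves a weighted $\ell_1$ problem with a sum constraint in $\mathcal{O}(n\log n)$. The paper's exact operator is not reproduced here; as a closely related illustration of the same sort-and-threshold idea, this is the classic $\mathcal{O}(n\log n)$ Euclidean projection onto the probability simplex (the unweighted sum-constraint case):

```python
def project_to_simplex(v):
    # Euclidean projection of v onto {x : x >= 0, sum(x) = 1}, computed in
    # O(n log n): sort once, then find the largest support for which the
    # shifted entries stay positive; theta is the corresponding shift.
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumsum += ui
        t = (cumsum - 1.0) / i
        if ui - t > 0:       # index i can remain in the support
            theta = t
    return [max(x - theta, 0.0) for x in v]
```

The sort dominates the cost; everything after it is a single linear pass, which is the structure an $\mathcal{O}(n\log n)$ weighted variant would share.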
Our proposed model has been compared with a convolutional encoder-decoder, and our model performed better at generating the missing parts of the images than the convolutional autoencoder.",18 "Structure-aware and temporally coherent 3D human pose estimation. Deep learning methods for 3D human pose estimation from RGB images require a huge amount of domain-specific labeled data for good in-the-wild performance. However, obtaining annotated 3D pose data requires a complex motion capture setup which is generally limited to controlled settings. We propose a semi-supervised learning method using a structure-aware loss function which is able to utilize abundant 2D data to learn 3D information. Furthermore, we present a simple temporal network which uses the additional context present in pose sequences to improve and temporally harmonize the pose estimates. Our complete pipeline improves upon the state-of-the-art by 11.8% and works at 30 fps on a commodity graphics card.",3 "Exponentially consistent kernel two-sample tests. Given two sets of independent samples from unknown distributions $P$ and $Q$, a two-sample test decides whether to reject the null hypothesis that $P=Q$. Recent attention has focused on kernel two-sample tests, as their test statistics are easy to compute, converge fast, and have low bias in their finite sample estimates. However, there still lacks an exact characterization of the asymptotic performance of such tests, and in particular of the rate at which the type-II error probability decays to zero in the large sample limit. In this work, we show that a class of kernel two-sample tests are exponentially consistent on any Polish, locally compact Hausdorff space, e.g., $\mathbb{R}^d$. The obtained exponential decay rate is further shown to be optimal among all two-sample tests meeting the given level constraint, and is independent of particular kernels provided that they are bounded, continuous and characteristic. Our key approach is an extended version of Sanov's theorem and a recent result that identifies the maximum mean discrepancy as a metric for the weak convergence of probability measures.",18 "Fitting a simplicial complex using a variation of k-means. We give a simple and effective two stage algorithm for approximating a point cloud $\mathcal{S}\subset\mathbb{R}^m$ by a simplicial complex $K$.
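The kernel two-sample entry builds on the maximum mean discrepancy (MMD). The standard unbiased estimator of MMD$^2$ with a Gaussian kernel can be written in a few lines; this is the textbook formula, not the paper's contribution:

```python
import math

def mmd2_unbiased(X, Y, gamma=1.0):
    # Unbiased estimate of the squared maximum mean discrepancy between
    # samples X and Y (sequences of tuples) under the Gaussian kernel
    # k(x, y) = exp(-gamma * ||x - y||^2).
    def k(a, b):
        return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    m, n = len(X), len(Y)
    xx = sum(k(a, b) for i, a in enumerate(X)
             for j, b in enumerate(X) if i != j) / (m * (m - 1))
    yy = sum(k(a, b) for i, a in enumerate(Y)
             for j, b in enumerate(Y) if i != j) / (n * (n - 1))
    xy = sum(k(a, b) for a in X for b in Y) / (m * n)
    return xx + yy - 2.0 * xy
```

Because the estimator is unbiased (the diagonal kernel terms are excluded), it can be slightly negative when the two samples come from the same distribution; a two-sample test rejects when it exceeds a level-dependent threshold.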
The first stage is an iterative fitting procedure that generalizes k-means clustering, while the second stage involves deleting redundant simplices. A form of dimension reduction of $\mathcal{S}$ is obtained as a consequence.",3 "Dynamics based features for graph classification. Numerous social, medical, engineering and biological challenges can be framed as graph-based learning tasks. Here, we propose a new feature based approach to network classification. We show how the dynamics on a network can be useful to reveal patterns about the organization of the components of the underlying graph where the process takes place. We define generalized assortativities on networks and use them as generalized features across multiple time scales. These features turn out to be suitable signatures for discriminating between different classes of networks. Our method is evaluated empirically on established network benchmarks. We also introduce a new dataset of human brain networks (connectomes) and use it to evaluate our method. Results reveal that our dynamics based features are competitive and often outperform the state of the art accuracies.",18 "SOMz: photometric redshift PDFs with self organizing maps and random atlas. In this paper we explore the applicability of the unsupervised machine learning technique of self organizing maps (SOM) to estimate galaxy photometric redshift probability density functions (PDFs). This technique takes a spectroscopic training set, and maps the photometric attributes, but not the redshifts, to a two dimensional surface by using a process of competitive learning where neurons compete to more closely resemble the training data in the multidimensional space. The key feature of a SOM is that it retains the topology of the input set, revealing correlations between the attributes that are not easily identified otherwise. We test three different 2D topological mappings: rectangular, hexagonal, and spherical, using data from the DEEP2 survey. We also explore different implementations and boundary conditions on the map, and also introduce the idea of a random atlas, where a large number of different maps are created and their individual predictions are aggregated to produce a more robust photometric redshift PDF. We also introduced a new metric, the $I$-score, which efficiently incorporates different metrics, making it easier to compare different results (from different parameters or different photometric redshift codes).
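The simplicial-complex entry's first stage generalizes k-means from centroid points to simplices. For reference, here is the plain Lloyd-style k-means iteration that the stage extends; the simplex-fitting generalization itself is not reproduced:

```python
def lloyd_kmeans(points, centers, iters=20):
    # Plain Lloyd iterations: assign each point to its nearest center, then
    # move each center to the mean of its assigned points. The cited paper's
    # first stage replaces centroid points with simplices in the same loop.
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centers[i])))
            clusters[j].append(p)
        centers = [tuple(sum(xs) / len(cl) for xs in zip(*cl))
                   if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return centers
```

An empty cluster keeps its old center, which is a common simple fallback; the generalized version would analogously refit each simplex to the points assigned to it.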
We find that by using the spherical topology mapping we obtain a better representation of the underlying multidimensional topology, which provides more accurate results that are comparable to other, state-of-the-art machine learning algorithms. Our results illustrate that unsupervised approaches have great potential for many astronomical problems, in particular for the computation of photometric redshifts.",0 "Japanese word sense disambiguation based on examples and synonyms. (Note for this abstract): The language is Japanese. If your printer does not have fonts for Japanese characters, the characters and figures will not be printed correctly. Dissertation for a bachelor's degree at Kyoto University (Nagao Lab.), March 1994.",1 "Predicting face recognition performance using image quality. This paper proposes a data-driven model to predict the performance of a face recognition system based on image quality features. We model the relationship between image quality features (e.g. pose, illumination, etc.) and recognition performance measures using a probability density function. To address the issue of the limited nature of practical training data inherent in data-driven models, we have developed a Bayesian approach to model the distribution of recognition performance measures in small regions of the quality space. Since the model is based solely on image quality features, it can predict performance even before the actual recognition has taken place. We evaluate the performance predictive capabilities of the proposed model for six face recognition systems (two commercial and four open source) operating on three independent data sets: MultiPIE, FRGC and CAS-PEAL. Our results show that the proposed model can accurately predict performance using an accurate and unbiased image quality assessor (IQA). Furthermore, our experiments highlight the impact of the unaccounted quality space -- the image quality features not considered by the IQA -- in contributing to performance prediction errors.",3 "QED as a many-body theory of worldlines: I. General formalism and infrared structure. We discuss a reformulation of QED in which matter and gauge fields are integrated out explicitly, resulting in a many-body Lorentz-covariant theory of 0+1 dimensional worldlines describing super-pairs of spinning charges interacting via Lorentz forces. This provides a powerful, string-inspired definition of amplitudes to all loop orders.
In particular, one obtains a general formulation of Wilson loops and lines, with exponentiated dynamical fields and spin precession contributions, as worldline contour averages that are exactly defined in terms of first-quantized path integrals. We discuss in detail the attractive features of this formalism for high-order perturbative computations. We show that worldline S-matrix elements, to all loop orders in perturbation theory, can be constructed to be manifestly free of soft singularities, with infrared (IR) divergences captured and removed by endpoint photon exchanges at infinity that are equivalent to the soft coherent dressings of the Dyson S-matrix proposed by Faddeev and Kulish. We discuss these IR structures and make connections with soft theorems, the Abelian exponentiation of IR divergences, and cusp anomalous dimensions. Follow-up papers will discuss the efficient computation of cusp anomalous dimensions and universal features of soft theorems.",9 "Obstacle avoidance through deep networks based on intermediate perception. Obstacle avoidance from monocular images is a challenging problem for robots. Though multi-view structure-from-motion could build 3D maps, it is not robust in textureless environments. Some learning-based methods exploit human demonstrations to predict a steering command directly from a single image. However, this method is usually biased towards certain tasks or demonstration scenarios, and is also biased by human understanding. In this paper, we propose a new method to predict a trajectory from images. We train our system on the diverse NYUv2 dataset. The ground truth trajectory is computed automatically from designed cost functions. The convolutional neural network perception is divided into two stages: first, predict the depth map and surface normal from RGB images, two important geometric properties related to 3D obstacle representation. Second, predict the trajectory from the depth and normal. Results show that the intermediate perception increases the accuracy by 20% over direct prediction. Our model generalizes well to other public indoor datasets and is also demonstrated in robot flights in simulation and experiments.",3 "Acquiring knowledge for evaluation of teachers' performance in higher education using a questionnaire.
In this paper, we present a step-by-step knowledge acquisition process, for which we have chosen a structured method using a questionnaire as the knowledge acquisition tool. We want to depict the problem domain as: how to evaluate teachers' performance in higher education with the use of expert system technology. The problem is how to acquire the specific knowledge for the selected problem efficiently and effectively from human experts and encode it in a suitable computer format. Acquiring knowledge from human experts in the process of expert systems development is one of the most common problems cited to date. The questionnaire was sent to 87 domain experts within public and private universities of Pakistan. Among them, 25 domain experts sent back their valuable opinions. The domain experts were highly qualified, well experienced and highly responsible persons. The whole questionnaire was divided into 15 main groups of factors, which were further divided into 99 individual questions. These facts were analyzed to give the final shape to the questionnaire. This knowledge acquisition technique may be used as a learning tool for research work.",3 "Fast and accurate reconstruction of compressed color light fields. Light field photography has been studied thoroughly in recent years. One of its drawbacks is the need for multi-lens imaging. To compensate for that, compressed light field photography has been proposed to tackle the trade-offs between the spatial and angular resolutions. It obtains, by using only one lens, a compressed version of the regular multi-lens system. The acquisition system consists of dedicated hardware followed by a decompression algorithm, which usually suffers from a high computational time. In this work, we suggest to compress the color channels as well and propose a computationally efficient neural network that achieves high-quality color light field reconstruction from a single coded image. Our approach outperforms existing solutions in terms of recovery quality and computational complexity. We also present a neural network for depth map extraction from the decompressed light field, which is trained in an unsupervised way without the ground truth depth map.",3 "A multi-wavelength study of the blazar 4C +01.02 during its long-term flaring activity in 2014-2017. We conducted a detailed long-term spectral and temporal study of the flat spectrum radio quasar 4C +01.02, using multi-wavelength observations from Fermi-LAT, Swift-XRT, and Swift-UVOT.
The $2$-day binned $\gamma$-ray lightcurve during the 2014-2017 active state displays $14$ peak structures with a maximum integral flux $(\rm E > 100 \ MeV)$ of $\rm (2.5 \pm 0.2) \times 10^{-6}\ ph\ cm^{-2}\ s^{-1}$ on MJD 57579.1, approximately $61$ times higher than the base flux of $\rm (4.1 \pm 0.3) \times 10^{-8}\ ph\ cm^{-2}\ s^{-1}$, calculated by averaging the flux points when the source was in its quiescent state. The shortest $\gamma$-ray variability of $0.66 \pm 0.08$ days was observed in the source. A correlation study between the $\gamma$-ray spectral index and flux suggests that the source deviates from the usual harder-when-brighter trend shown by blazars. To understand the likely physical scenario responsible for the flux variation, we performed a detailed broadband spectral analysis of the source by selecting different flux states from the multi-wavelength lightcurve. A single-zone leptonic model was able to reproduce the broadband spectral energy distribution (SED) of each state. The parameters of the model for each flux state were determined using a $\chi^2$ fit. We observed that the synchrotron, synchrotron-self-Compton (SSC), and external-Compton (EC) processes can produce the broadband SED of all the varied flux states. An adjoining contribution of seed photons from the broad-line region (BLR) and the IR torus for the EC process was required to provide adequate fits to the GeV spectrum of the chosen states.",0 "Learning 2D Gabor filters by infinite kernel learning regression. Gabor functions have wide-spread applications in image processing and computer vision. In this paper, we prove that 2D Gabor functions are translation-invariant positive-definite kernels and propose a novel formulation for the problem of image representation with Gabor functions based on infinite kernel learning regression. Using this formulation, we obtain a support vector expansion of an image based on a mixture of Gabor functions. The problem with this representation is that all Gabor functions are present at all support vector pixels. By applying LASSO to the support vector expansion, we obtain a sparse representation in which each Gabor function is positioned at a small set of pixels. As an application, we introduce a method for learning a dataset-specific set of Gabor filters that can be used subsequently for feature extraction.
Our experiments show that use of the learned Gabor filters improves the recognition accuracy of a recently introduced face recognition algorithm.",3 "Evolutionary design of photometric systems and its application to Gaia. Designing a photometric system to best fulfil a set of scientific goals is a complex task, demanding a compromise between conflicting requirements and subject to various constraints. A specific example is the determination of stellar astrophysical parameters (APs) - effective temperature, metallicity etc. - across a wide range of stellar types. We present a novel approach to this problem which makes minimal assumptions about the required filter system. By considering a filter system as a set of free parameters, it may be designed by optimizing a figure-of-merit (FoM) with respect to these parameters. In the example considered, the FoM is a measure of how well the filter system can `separate' stars with different APs. This separation is vectorial in nature, in the sense that the local directions of AP variance are preferably mutually orthogonal to avoid AP degeneracy. The optimization is carried out with an evolutionary algorithm, which uses principles of evolutionary biology to search the parameter space. This model, HFD (Heuristic Filter Design), is applied to the design of photometric systems for the Gaia space astrometry mission. The optimized systems show a number of interesting features, not least the persistence of broad, overlapping filters. These HFD systems perform at least as well as other proposed systems for Gaia, although inadequacies remain in all. The principles underlying HFD are quite generic and may be applied to filter design for numerous other projects, such as the search for specific types of objects or photometric redshift determination.",0 "Practical reasoning for expressive description logics. Description logics (DLs) are a family of knowledge representation formalisms mainly characterised by constructors that build complex concepts and roles from atomic ones. Expressive role constructors are important in many applications, but can be computationally problematical. We present an algorithm that decides satisfiability of the DL ALC extended with transitive and inverse roles, role hierarchies, and qualifying number restrictions. Early experiments indicate that this algorithm is well-suited for implementation. Additionally, we show that ALC extended with just transitive and inverse roles is still in PSPACE.
Finally, we investigate the limits of decidability for this family of DLs.",3 "Automatic labeling of molecular biomarkers in whole slide immunohistochemistry images using fully convolutional networks. This paper addresses the problem of quantifying biomarkers in multi-stained tissues based on color and spatial information. A deep learning based method to automatically localize and quantify the cells expressing biomarker(s) in a whole slide image is proposed. The deep learning network is a fully convolutional network (FCN) whose input is the true RGB color image of the tissue and whose output is a map of the different biomarkers. The FCN relies on a convolutional neural network (CNN) that classifies each cell separately according to the biomarker it expresses. In this study, images of immunohistochemistry (IHC) stained slides were collected and used. 4,500 RGB images of cells were manually labeled based on the expressed biomarkers. The labeled cell images were used to train the CNN (obtaining an accuracy of 92% on a test set). The trained CNN is then extended to an FCN that generates a map of all biomarkers in the whole slide image acquired by the scanner (instead of classifying every cell image). To evaluate our method, we manually labeled all the nuclei expressing different biomarkers in two whole slide images and used these as the ground truth. The proposed method for immunohistochemical analysis compares well with the manual labeling by humans (average F-score of 0.96).",15 "Collective prediction of individual mobility traces with exponential weights. We present and test a sequential learning algorithm for the short-term prediction of human mobility. This novel approach pairs an exponential weights forecaster with a very large ensemble of experts. The experts are individual sequence prediction algorithms constructed from the mobility traces of 10 million roaming mobile phone users in a European country. The average prediction accuracy is significantly higher than that of individual sequence prediction algorithms, namely constant order Markov models derived from the user's own data, which have been shown to achieve high accuracy in previous studies of human mobility prediction. The algorithm uses only time-stamped location data, and its accuracy depends on the completeness of the expert ensemble, which should contain redundant records of typical mobility patterns.
The proposed algorithm is applicable to the prediction of any sufficiently large dataset of sequences.",14 "A data-driven approach for semantic role labeling from induced grammar structures in language. Semantic roles play an important role in extracting knowledge from text. Current unsupervised approaches utilize features from grammar structures to induce semantic roles. The dependence on these grammars, however, makes it difficult to adapt to noisy and new languages. In this paper we develop a data-driven approach to identifying semantic roles; the approach is entirely unsupervised up to the point where rules need to be learned to identify the position at which a semantic role occurs. Specifically, we develop a modified-ADIOS algorithm based on the ADIOS algorithm of Solan et al. (2005) to learn grammar structures, and use these grammar structures to learn the rules for identifying semantic roles based on the context in which the grammar structures appeared. The results obtained are comparable with current state-of-the-art models that are inherently dependent on human-annotated data.",3 "Deep convolutional neural networks for microscopy-based point of care diagnostics. Point of care diagnostics using microscopy and computer vision methods have been applied to a number of practical problems, and are particularly relevant to low-income, high disease burden areas. However, they are subject to the limitations in sensitivity and specificity of the computer vision methods used. In general, deep learning has recently revolutionised the field of computer vision, in some cases surpassing human performance for object recognition tasks. In this paper, we evaluate the performance of deep convolutional neural networks on three different microscopy tasks: diagnosis of malaria in thick blood smears, tuberculosis in sputum samples, and intestinal parasite eggs in stool samples. In all cases accuracy is very high and substantially better than an alternative approach representative of traditional medical imaging techniques.",3 "Concurrent learning based adaptive control of Euler-Lagrange systems with guaranteed parameter convergence. This work presents a solution to the adaptive tracking control of Euler-Lagrange systems with guaranteed tracking and parameter estimation error convergence.
Specifically, a concurrent learning based update rule, fused with a filtered version of the desired system dynamics in conjunction with a desired state based regression matrix, is utilized to ensure that both the position tracking error and the parameter estimation error terms converge to the origin exponentially. The regression matrix used in the proposed controller makes use of the desired versions of the system states, and an initial, sufficiently exciting memory stack can be formed from knowledge of the desired system trajectory a priori, thus removing the initial excitation condition required by previously proposed concurrent learning based controllers in the literature. Output feedback versions of the proposed method, where only position measurements are available for the controller design (for both gradient and composite type adaptations), are also presented in order to illustrate the modularity of the proposed method. The stability and boundedness of the closed-loop signals for the proposed controllers are ensured via Lyapunov based analysis.",5 "An automatic method for finding topic boundaries. This article outlines a new method for locating discourse boundaries based on lexical cohesion and a graphical technique called dotplotting. The application of dotplotting to discourse segmentation can be performed either manually, by examining a graph, or automatically, using an optimization algorithm. The results of two experiments involving automatically locating boundaries in a series of concatenated documents are presented. Areas of application and future directions for this work are also outlined.",1 "The benefits of output sparsity for multi-label classification.
The multi-label classification framework, where each observation is associated with a set of labels, has generated a tremendous amount of attention over recent years. Modern multi-label problems are typically large-scale in terms of the number of observations, features and labels, and the number of labels can even be comparable to the number of observations. In this context, different remedies have been proposed to overcome the curse of dimensionality. In this work, we aim at exploiting the output sparsity by introducing a new loss, called the sparse weighted Hamming loss. The proposed loss can be seen as a weighted version of the classical ones, where active and inactive labels are weighted separately. Leveraging the influence of sparsity in the loss function, we provide improved generalization bounds for the empirical risk minimizer, a suitable property for large-scale problems. For this new loss, we derive rates of convergence linear in the underlying output-sparsity rather than linear in the number of labels. In practice, minimizing the associated risk can be performed efficiently using convex surrogates and modern convex optimization algorithms. We provide experiments on various real-world datasets demonstrating the pertinence of our approach when compared to non-weighted techniques.",10 "Approximate inference with amortised MCMC. We propose a novel approximate inference algorithm that approximates a target distribution by amortising the dynamics of a user-selected MCMC sampler. The idea is to initialise MCMC using samples from an approximation network, apply the MCMC operator to improve these samples, and finally use the samples to update the approximation network, thereby improving its quality. This provides a new generic framework for approximate inference, allowing us to deploy highly complex, or implicitly defined, approximation families with intractable densities, including approximations produced by warping a source of randomness with a deep neural network. Experiments consider image modelling with deep generative models as a challenging test for the method. Deep models trained using amortised MCMC are shown to generate realistic looking samples as well as producing diverse imputations for images with regions of missing pixels.",18 "Quantum thermodynamic uncertainties in nonequilibrium systems from Robertson-Schrödinger relations.
Thermodynamic uncertainty principles make up one of the rare anchors in the largely uncharted waters of nonequilibrium systems, where fluctuation theorems are more familiar. In this work we aim to trace the uncertainties of thermodynamic quantities in nonequilibrium systems to their quantum origins, namely, to the quantum uncertainty principles. Our results enable us to make a categorical statement: for Gaussian systems, thermodynamic functions are functionals of the Robertson-Schrödinger uncertainty function, which is always non-negative for quantum systems, but not necessarily so for classical systems. Here, quantum refers to the noncommutativity of the canonical operator pairs. From the nonequilibrium free energy [1], we succeeded in deriving several inequalities between certain thermodynamic quantities. They assume the same forms as those in conventional thermodynamics, but are nonequilibrium in nature and hold for all times and at strong coupling. In addition we show that a fluctuation-dissipation inequality exists at all times in the nonequilibrium dynamics of the system. For nonequilibrium systems which relax to an equilibrium state at late times, this fluctuation-dissipation inequality leads to the Robertson-Schrödinger uncertainty principle with the help of the Cauchy-Schwarz inequality. This work provides the microscopic quantum basis of certain important thermodynamic properties of macroscopic nonequilibrium systems.",2 "Accelerating deep learning with memcomputing. Restricted Boltzmann machines (RBMs) and their extensions, called 'deep-belief networks', are powerful neural networks that have found applications in the fields of machine learning and big data. The standard way of training these models resorts to an iterative unsupervised procedure based on Gibbs sampling, called 'contrastive divergence' (CD), plus additional supervised tuning via back-propagation. However, this procedure has been shown not to follow any gradient and can lead to suboptimal solutions. In this paper, we show an efficient alternative to CD by means of simulations of digital memcomputing machines (DMMs). We test our approach on pattern recognition using a modified version of the MNIST data set. DMMs sample effectively the vast phase space given by the model distribution of the RBM, and provide a very good approximation close to the optimum.
This efficient search significantly reduces the number of pretraining iterations necessary to achieve a given level of accuracy, and yields a total performance gain over CD. In fact, the acceleration of pretraining achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of the quantum annealing method on the same network and data set. Notably, however, DMMs perform far better than the reported quantum annealing results in terms of quality of the training. We also compare our method against advances in supervised training, like batch-normalization and rectifiers, that work to reduce the advantage of pretraining. We find that our memcomputing method still maintains a quality advantage ($>1\%$ in accuracy, a $20\%$ reduction in error rate) over these approaches. Furthermore, our method is agnostic about the connectivity of the network. Therefore, it can be extended to train full Boltzmann machines, and even deep networks at once.",3 "Contextual regression: an accurate and conveniently interpretable nonlinear model for mining discovery from scientific data. Machine learning algorithms such as linear regression, SVM and neural networks have played an increasingly important role in the process of scientific discovery. However, none of them is both interpretable and accurate on nonlinear datasets. Here we present contextual regression, a method that joins these two desirable properties together using a hybrid architecture of neural network embedding and a dot product layer. We demonstrate its high prediction accuracy and sensitivity through the task of predictive feature selection on a simulated dataset and the application of predicting open chromatin sites in the human genome. On the simulated data, our method achieved high fidelity recovery of feature contributions under random noise levels up to 200%. On the open chromatin dataset, the application of our method not only outperformed the state-of-the-art method in terms of accuracy, but also unveiled two previously unfound open-chromatin-related histone marks. Our method can fill the blank of accurate and interpretable nonlinear modeling in scientific data mining tasks.",15 "Retrodictive quantum computing. Quantum models of computation are widely believed to be more powerful than classical ones. Efforts center on proving that, for a given problem, quantum algorithms are more resource efficient than any classical one.
All this, however, assumes a standard predictive paradigm of reasoning where, given initial conditions, the future holds the answer. What about bringing information from the future to the present and exploiting it to one's advantage? This is a radical new approach to reasoning, so-called retrodictive computation, that benefits from the specific form of the computed functions. We demonstrate how to use tools of symbolic computation to realize retrodictive quantum computing at scale and exploit it to efficiently, and classically, solve instances of the quantum Deutsch-Jozsa, Bernstein-Vazirani, Simon, Grover, and Shor's algorithms.",17 "The simulator: understanding adaptive sampling in the moderate-confidence regime. We propose a novel technique for analyzing adaptive sampling called the {\em simulator}. Our approach differs from the existing methods by considering not how much information could be gathered by any fixed sampling strategy, but how difficult it is to distinguish a good sampling strategy from a bad one given the limited amount of data collected up to any given time. This change of perspective allows us to match the strength of both Fano and change-of-measure techniques, without succumbing to the limitations of either method. For concreteness, we apply our techniques to a structured multi-arm bandit problem in the fixed-confidence pure exploration setting, where we show that the constraints on the means imply a substantial gap between the moderate-confidence sample complexity and the asymptotic sample complexity as $\delta \to 0$ found in the literature. We also prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors. Moreover, our lower bounds zero in on the number of times each \emph{individual} arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity. Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first {\em practical} algorithm of its kind for the latter problem which removes extraneous log factors, and outperforms the state-of-the-art in experiments.",3 "Spectral community detection in heterogeneous large networks. In this article, we study spectral methods for community detection based on the $\alpha$-parametrized normalized modularity matrix, hereafter called ${\bf L}_\alpha$, in heterogeneous graph models.
We show that, in a regime where the community detection is not asymptotically trivial, ${\bf L}_\alpha$ is well approximated by a more tractable random matrix which falls in the family of spiked random matrices. The analysis of this equivalent spiked random matrix allows us to improve spectral methods for community detection and to assess their performance in the regime under study. In particular, we prove the existence of an optimal value $\alpha_{\rm opt}$ of the parameter $\alpha$ for which the detection of communities is best ensured, and we provide an on-line estimation of $\alpha_{\rm opt}$ based only on the knowledge of the graph adjacency matrix. Unlike classical spectral methods for community detection, where the clustering is performed on the eigenvectors associated with extreme eigenvalues, we show through our theoretical analysis that a regularization should instead be performed on those eigenvectors prior to clustering in heterogeneous graphs. Finally, through a deeper study of the regularized eigenvectors used for clustering, we assess the performance of a new algorithm for community detection. Numerical simulations in the course of the article show that our methods outperform state-of-the-art spectral methods on dense heterogeneous graphs.",18 "Initialization of multilayer forecasting artificial neural networks. In this paper, a new method is developed for initialising artificial neural networks that predict the dynamics of time series. The initial weighting coefficients of the neurons are determined analogously to the case of a linear prediction filter. Moreover, to improve the accuracy of this initialization method for a multilayer neural network, variants of the decomposition of the transformation matrix corresponding to the linear prediction filter are suggested. The efficiency of the proposed neural network prediction method is shown by forecasting solutions of the Lorenz chaotic system.",3 "cGAN-based manga colorization using a single training image. The Japanese comic format known as manga is popular all over the world. It is traditionally produced in black and white, and colorization is time consuming and costly. Automatic colorization methods generally rely on greyscale values, which are not present in manga. Furthermore, due to copyright protection, colorized manga available for training is scarce. We propose a manga colorization method based on conditional generative adversarial networks (cGAN).
Unlike previous cGAN approaches that use many hundreds or thousands of training images, our method requires only a single colorized reference image for training, avoiding the need for a large dataset. Colorizing manga using cGANs can produce blurry results with artifacts, and the resolution is limited. We therefore also propose a method of segmentation and color-correction to mitigate these issues. The final results are sharp, clear, in high resolution, and stay true to the character's original color scheme.",3 "Snake: a stochastic proximal gradient algorithm for regularized problems over large graphs. A regularized optimization problem over a large unstructured graph is studied, where the regularization term is tied to the graph geometry. Typical regularization examples include the total variation and the Laplacian regularizations over the graph. When applying the proximal gradient algorithm to solve this problem, there exist quite affordable methods to implement the proximity operator (backward step) in the special case where the graph is a simple path without loops. In this paper, an algorithm, referred to as ""Snake"", is proposed to solve such regularized problems over general graphs by taking benefit of these fast methods. The algorithm consists in properly selecting random simple paths in the graph and performing the proximal gradient algorithm over these simple paths. This algorithm is an instance of a new general stochastic proximal gradient algorithm, whose convergence is proven. Applications to trend filtering and graph inpainting are provided among others. Numerical experiments are conducted over large graphs.",10 "Language distribution prediction based on batch Markov Monte Carlo simulation with migration. Language spreading is a complex mechanism that involves issues like culture, economics, migration, population etc. In this paper, we propose a set of methods to model the dynamics of the spreading system. To model the randomness of language spreading, we propose the batch Markov Monte Carlo simulation with migration (BMMCSM) algorithm, in which each agent is treated as a language stack. The agent learns languages and migrates based on the proposed batch Markov property, according to the transition matrix T and the migration matrix M. Since population plays a crucial role in language spreading, we also introduce a mortality and fertility mechanism, which controls the birth and death of the simulated agents, into the BMMCSM algorithm.
The simulation results of BMMCSM show that both the numerical and the geographic distribution of languages varies across time. The change of the distribution fits the world cultural and economic development trend. Next, to construct the matrix T, some of its entries can be directly calculated from historical statistics while the other entries are unknown. Thus, the key to the success of BMMCSM lies in the accurate estimation of the transition matrix by estimating the unknown entries under the supervision of the known entries. To achieve this, we first construct a 20 x 20 x 5 factor tensor X to characterize each entry of T. Then we train a random forest regressor on the known entries and use the trained regressor to predict the unknown entries. The reason why we choose random forest (RF) is that, compared to a single decision tree, it conquers the problem of over-fitting, and a Shapiro test also suggests that the residuals of the RF follow a normal distribution.",3 "Identifying best interventions through online importance sampling. Motivated by applications in computational advertising and systems biology, we consider the problem of identifying the best out of several possible soft interventions at a source node $V$ in an acyclic causal directed graph, to maximize the expected value of a target node $Y$ (located downstream of $V$). Our setting imposes a fixed total budget for sampling under the various interventions, along with cost constraints on the different types of interventions. We pose this as a best arm identification bandit problem with $K$ arms, where each arm is a soft intervention at $V$, and leverage the information leakage among the arms to provide the first gap-dependent error and simple regret bounds for this problem. Our results are a significant improvement over the traditional best arm identification results. We empirically show that our algorithms outperform the state of the art on a flow cytometry data-set, and also apply our algorithm for model interpretation of the Inception-v3 deep net that classifies images.",18 "Symmetric tensor completion from multilinear entries and learning product mixtures over the hypercube. We give an algorithm for completing an order-$m$ symmetric low-rank tensor from its multilinear entries in time roughly proportional to the number of tensor entries. We apply our tensor completion algorithm to the problem of learning mixtures of product distributions over the hypercube, obtaining new algorithmic results.
If the centers of the product distribution are linearly independent, then we recover distributions with as many as $\Omega(n)$ centers in polynomial time and sample complexity. In the general case, we recover distributions with as many as $\tilde\Omega(n)$ centers in quasi-polynomial time, answering an open problem of Feldman et al. (SIAM J. Comp.) for the special case of distributions with incoherent bias vectors. Our main algorithmic tool is the iterated application of a low-rank matrix completion algorithm for matrices with adversarially missing entries.",3 "An efficient pseudo-likelihood method for sparse binary pairwise Markov network estimation. The pseudo-likelihood method is one of the most popular algorithms for learning sparse binary pairwise Markov networks. In this paper, we formulate the $l_1$ regularized pseudo-likelihood problem as a sparse multiple logistic regression problem. In this way, many insights and optimization procedures for sparse logistic regression can be applied to the learning of discrete Markov networks. Specifically, we use the coordinate descent algorithm for generalized linear models with convex penalties, combined with strong screening rules, to solve the pseudo-likelihood problem with $l_1$ regularization. Therefore a substantial speedup without losing any accuracy can be achieved. Furthermore, this method is more stable than the node-wise logistic regression approach on unbalanced high-dimensional data when penalized by small regularization parameters. Thorough numerical experiments on simulated data and real-world data demonstrate the advantages of the proposed method.",18 Interpreting extracted rules from an ensemble of trees: application to computer-aided diagnosis of breast MRI. High predictive performance and ease of use and interpretability are important requirements for the applicability of a computer-aided diagnosis (CAD) system to human reading studies. We propose a CAD system specifically designed to be more comprehensible to the radiologist reviewing screening breast MRI studies. Multiparametric imaging features are combined to produce a CAD system for differentiating cancerous and non-cancerous lesions.
The complete system uses a rule-extraction algorithm to present the lesion classification results in an easy to understand graph visualization.,18 "Search for Higgs boson decay to a charm quark-antiquark pair in proton-proton collisions at $\sqrt{s}$ = 13 TeV. A search for the standard model Higgs boson decaying to a charm quark-antiquark pair, H $\to \mathrm{c\bar{c}}$, produced in association with a leptonically decaying V (W or Z) boson is presented. The search is performed with proton-proton collisions at $\sqrt{s}$ = 13 TeV collected by the CMS experiment, corresponding to an integrated luminosity of 138 fb$^{-1}$. Novel charm jet identification and analysis methods using machine learning techniques are employed. The analysis is validated by searching for Z $\to \mathrm{c\bar{c}}$ in VZ events, leading to its first observation at a hadron collider with a significance of 5.7 standard deviations. The observed (expected) upper limit on $\sigma$(VH)$\mathcal{B}$(H $\to \mathrm{c\bar{c}}$) is 0.94 (0.50$^{+0.22}_{-0.15}$) pb at the 95% confidence level (CL), corresponding to 14 (7.6$^{+3.4}_{-2.3}$) times the standard model prediction. For the Higgs-charm Yukawa coupling modifier, $\kappa_\mathrm{c}$, the observed (expected) 95% CL interval is 1.1 $\lt\vert\kappa_\mathrm{c}\vert \lt$ 5.5 ($\vert\kappa_\mathrm{c}\vert\lt$ 3.4), the most stringent constraint to date.",7 "Onsets and frames: dual-objective piano transcription. We consider the problem of transcribing polyphonic piano music with an emphasis on generalizing to unseen instruments. We use deep neural networks and propose a novel approach that predicts onsets and frames using both CNNs and LSTMs. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note to start unless the onset detector also agrees that an onset for that pitch is present in the frame. We focus on improving onsets and offsets together instead of either in isolation, as we believe this correlates better with human musical perception. This technique results in over a 100% relative improvement in note-with-offset score on the MAPS dataset.",3 "A time-varying study of Chinese investor sentiment, stock market liquidity and volatility: based on the deep learning BERT model and TVP-VAR model.
based on commentary data from the shenzhen stock index bar on the eastmoney website from january 1, 2018 to december 31, 2019, this paper extracts the embedded investor sentiment using the deep learning bert model and investigates the time-varying linkage between investor sentiment, stock market liquidity and volatility using the tvp-var model. the results show that the impact of investor sentiment on stock market liquidity and volatility is the stronger one. although the inverse effect is relatively small, it is more pronounced in certain states of the stock market. in all cases, the response is more pronounced in the short term than in the medium and long term, and the impact is asymmetric, with stronger shocks when the market is in a downward spiral.",16 "free form based active contours for image segmentation and free space perception. in this paper we present a novel approach for representing and evolving deformable active contours. the method combines piecewise regular b{\'e}zier models and curve evolution defined by local free form deformation. the contour deformation is locally constrained, which allows contour convergence with almost linear complexity while adapting to various shape settings and handling topology changes of the active contour. we demonstrate the effectiveness of the new active contour scheme for visual free space perception and segmentation, using omnidirectional images acquired by a robot exploring unknown indoor and outdoor environments. several experiments validate the approach, with comparison to state-of-the-art parametric and geometric active contours, and provide fast and real-time robot free space segmentation and navigation.",3 "deep gradient compression: reducing the communication bandwidth for distributed training. large-scale distributed training requires significant communication bandwidth for gradient exchange, which limits the scalability of multi-node training and requires expensive high-bandwidth network infrastructure. the situation gets even worse with distributed training on mobile devices (federated learning), which suffers from higher latency, lower throughput, and intermittent poor connections. in this paper, we find that 99.9% of the gradient exchange in distributed sgd is redundant, and propose deep gradient compression (dgc) to greatly reduce the communication bandwidth.
to preserve accuracy during compression, dgc employs four methods: momentum correction, local gradient clipping, momentum factor masking, and warm-up training. we have applied deep gradient compression to image classification, speech recognition, and language modeling with multiple datasets including cifar10, imagenet, penn treebank, and the librispeech corpus. in these scenarios, deep gradient compression achieves a gradient compression ratio from 270x to 600x without losing accuracy, cutting the gradient size of resnet-50 from 97mb to 0.35mb, and that of deepspeech from 488mb to 0.74mb. deep gradient compression enables large-scale distributed training on inexpensive commodity 1gbps ethernet and facilitates distributed training on mobile.",3 "a category space approach to supervised dimensionality reduction. supervised dimensionality reduction has emerged as an important theme in the last decade. despite the plethora of models and formulations, there is a lack of a simple model that aims to project a set of patterns into a space defined by the classes (or categories). to this end, we set up a model in which each class is represented as a 1d subspace of the vector space formed by the features. assuming the set of classes does not exceed the cardinality of the features, the model results in multi-class supervised learning in which the features of each class are projected into the class subspace. class discrimination is automatically guaranteed via the imposition of orthogonality of the 1d class sub-spaces. the resulting optimization problem - formulated as the minimization of a sum of quadratic functions on a stiefel manifold - is non-convex (due to the constraints), but nevertheless has a structure for which we can identify when we have reached a global minimum. after formulating a version with standard inner products, we extend the formulation to reproducing kernel hilbert spaces in a straightforward manner. the optimization approach also extends in a similar fashion to the kernel version. results and comparisons with the multi-class fisher linear (and kernel) discriminants and principal component analysis (linear and kernel) showcase the relative merits of this approach to dimensionality reduction.",18 "parallel and distributed thompson sampling for large-scale accelerated exploration of chemical space. chemical space is so large that brute force searches for new interesting molecules are infeasible.
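the dgc abstract rests on sending only the largest gradient entries while accumulating the rest locally until they become significant. the sketch below shows only that top-k plus local-accumulation core; the paper's momentum correction, clipping, masking, and warm-up are omitted, and the gradient vector and sparsity ratio are toy values.

```python
def sparsify(grad, residual, ratio=0.01):
    """top-k gradient sparsification with local accumulation: entries
    below the threshold stay in `residual` and are sent later, once
    they have accumulated enough magnitude."""
    acc = [r + g for r, g in zip(residual, grad)]
    k = max(1, int(len(acc) * ratio))
    # threshold = magnitude of the k-th largest accumulated entry
    thr = sorted((abs(v) for v in acc), reverse=True)[k - 1]
    sent = [v if abs(v) >= thr else 0.0 for v in acc]
    new_residual = [0.0 if abs(v) >= thr else v for v in acc]
    return sent, new_residual

grad = [0.001 * i for i in range(100)]   # toy gradient vector
residual = [0.0] * 100
sent, residual = sparsify(grad, residual, ratio=0.05)
nonzero = sum(1 for v in sent if v != 0.0)
```

with a 5% ratio only the five largest entries are transmitted; everything else is carried forward in the residual, so no gradient information is discarded, merely delayed.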
high-throughput virtual screening via computer cluster simulations can speed up the discovery process by collecting large amounts of data in parallel, e.g., hundreds or thousands of parallel measurements. bayesian optimization (bo) can produce additional acceleration by sequentially identifying the most useful simulations or experiments to be performed next. however, current bo methods cannot scale to the large numbers of parallel measurements and the massive libraries of molecules currently used in high-throughput screening. here, we propose a scalable solution based on a parallel and distributed implementation of thompson sampling (pdts). we show that, on small scale problems, pdts performs similarly to parallel expected improvement (ei), a batch version of the most widely used bo heuristic. additionally, in settings where parallel ei does not scale, pdts outperforms other scalable baselines such as a greedy search, $\epsilon$-greedy approaches and a random search method. these results show that pdts is a successful solution for large-scale parallel bo.",18 "leverage, influence, and the jackknife in clustered regression models: reliable inference using summclust. cluster-robust inference is widely used in modern empirical work in economics and many other disciplines. when data are clustered, the key unit of observation is the cluster. we propose measures of ""high-leverage"" clusters and ""influential"" clusters for linear regression models. our measures of leverage and partial leverage, and functions of them, can be used as diagnostic tools to identify datasets and regression designs in which cluster-robust inference is likely to be challenging. our measures of influence provide valuable information about how the results depend on the data in the various clusters. we also show how to calculate two jackknife variance matrix estimators, cv3 and cv3j, as a byproduct of these computations. all these quantities, including the jackknife variance estimators, are computed by a new stata package called summclust, which also summarizes the cluster structure of a dataset.",4 "deep supervised learning using local errors. error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers.
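the pdts abstract parallelizes bayesian optimization by letting each worker act on its own posterior sample. a minimal sketch of one batch round over a discrete candidate set, assuming (purely for illustration) independent gaussian posteriors per candidate; the real system works over molecular libraries with a shared surrogate model.

```python
import random

def pdts_batch(means, stds, pending, batch_size, rng):
    """one round of parallel thompson sampling over a discrete candidate
    set: each of `batch_size` workers draws an independent posterior
    sample per candidate and evaluates the candidate maximizing it."""
    chosen = []
    for _ in range(batch_size):
        best, best_val = None, float("-inf")
        for i, (m, s) in enumerate(zip(means, stds)):
            if i in pending or i in chosen:
                continue  # avoid duplicating in-flight evaluations
            val = rng.gauss(m, s)  # posterior sample for candidate i
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen

rng = random.Random(0)
# toy posterior over 6 candidate molecules; candidate 5 looks most promising
means = [0.0, 0.1, 0.2, 0.1, 0.0, 2.0]
stds = [0.5] * 6
batch = pdts_batch(means, stds, pending=set(), batch_size=3, rng=rng)
```

because each worker only needs a posterior sample, the scheme distributes trivially, which is exactly the scalability advantage over batch ei highlighted in the abstract.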
learning using such delayed and non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural networks, as it requires neurons to maintain a memory of the input long enough for the higher-layer errors to arrive. in this paper, we propose an alternative learning mechanism in which errors are generated locally in each layer using fixed, random auxiliary classifiers. lower layers can thus be trained independently of higher layers, and training can either proceed layer by layer, or simultaneously in all layers using the local error information. we address biological plausibility concerns such as weight symmetry requirements and show that the proposed learning mechanism, based on fixed, broad, and random tuning of each neuron to the classification categories, outperforms the biologically-motivated feedback alignment learning technique on the mnist, cifar10, and svhn datasets, approaching the performance of standard backpropagation. our approach highlights a potential biological mechanism for the supervised, task-dependent, learning of feature hierarchies. in addition, we show that it is well suited for learning deep networks in custom hardware, where it can drastically reduce memory traffic and data communication overheads.",3 "a hybrid word-character model for abstractive summarization. abstractive summarization is a popular research topic nowadays. due to differences in language properties, chinese summarization has also gained a lot of attention. most studies use character-based rather than word-based representation, to avoid the error introduced by word segmentation and the oov problem. however, we believe that word-based representation can capture the semantics of articles more accurately. our proposed hybrid word-character model preserves the advantages of both word-based and character-based representations. the method also enables us to use a larger word vocabulary size than previous work. we call this new method hwc (hybrid word-character). we conduct experiments on the lcsts chinese summarization dataset and out-perform the current state-of-the-art by at least 8 rouge points.",3 "part-of-speech tagging with two sequential transducers. we present a method of constructing and using a cascade consisting of a left- and a right-sequential finite-state transducer (fst), t1 and t2, for part-of-speech (pos) disambiguation.
compared to an hmm, the fst cascade has the advantage of significantly higher processing speed, at the cost of slightly lower accuracy. applications such as information retrieval, where speed can be more important than accuracy, could benefit from this approach. in the process of tagging, we first assign every word a unique ambiguity class c_i that can be looked up in a lexicon encoded as a sequential fst. every c_i is denoted by a single symbol, e.g. [adj_noun], although it represents the set of alternative tags that the given word can occur with. the sequence of c_i of all words of one sentence is the input to our fst cascade. it is mapped by t1, from left to right, to a sequence of reduced ambiguity classes r_i. every r_i is denoted by a single symbol, although it represents a set of alternative tags. intuitively, t1 eliminates the less likely tags from c_i, thus creating r_i. finally, t2 maps the sequence of r_i, from right to left, to a sequence of single pos tags t_i. intuitively, t2 selects the most likely t_i for every r_i. the probabilities of the t_i, r_i, and c_i are used at compile time, not at run time. they do not (directly) occur in the fsts, but are ""implicitly contained"" in their structure.",3 "online learning with predictable sequences. we present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. specifically, if the sequence encountered by the learner is described well by a known ""predictable process"", the algorithms presented enjoy tighter bounds compared to the typical worst-case bounds. additionally, the methods achieve the usual worst-case regret bounds if the sequence is not benign. our approach can be seen as a way of adding prior knowledge about the sequence within the paradigm of online learning. the setting is shown to encompass partial and side information. variance and path-length bounds can be seen as particular examples of online learning with simple predictable sequences. we extend our methods and results to include competing with a set of possible predictable processes (models), that is, ""learning"" the predictable process concurrently with using it to obtain better regret guarantees. we show that such model selection is possible under various assumptions on the available feedback. our results suggest a promising direction of research with potential applications to stock market and time series prediction.",18 "brundlefly at semeval-2016 task 12: recurrent neural networks vs. joint inference for clinical temporal information extraction.
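the two-transducer tagging abstract describes a left-to-right reduction pass (t1) followed by a right-to-left selection pass (t2). the sketch below uses plain dictionaries as stand-ins for the compiled sequential fsts, with an invented three-word lexicon and hand-picked transitions; the real system encodes the probabilities implicitly in the transducer structure at compile time.

```python
# lexicon: word -> ambiguity class (a single symbol naming a tag set)
LEXICON = {"the": "[det]", "fair": "[adj_noun]", "deal": "[noun_verb]"}

# t1: (previous reduced class, current ambiguity class) -> reduced class
T1 = {
    ("#", "[det]"): "[det]",
    ("[det]", "[adj_noun]"): "[adj_noun]",    # still ambiguous after a det
    ("[adj_noun]", "[noun_verb]"): "[noun]",  # in adj/noun context, drop verb
}

# t2: (following tag, current reduced class) -> single pos tag
T2 = {
    ("#", "[noun]"): "noun",
    ("noun", "[adj_noun]"): "adj",  # before a noun, prefer adjective
    ("adj", "[det]"): "det",
}

def tag(words):
    classes = [LEXICON[w] for w in words]
    reduced, prev = [], "#"
    for c in classes:               # left-to-right pass (t1)
        prev = T1[(prev, c)]
        reduced.append(prev)
    tags, nxt = [], "#"
    for r in reversed(reduced):     # right-to-left pass (t2)
        nxt = T2[(nxt, r)]
        tags.append(nxt)
    return list(reversed(tags))

result = tag(["the", "fair", "deal"])
# → ['det', 'adj', 'noun']
```

t1 narrows [noun_verb] to [noun] using left context, and t2 then resolves [adj_noun] to adj using the right context it sees on the return pass, mirroring the two intuitions in the abstract.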
we submitted two systems to the semeval-2016 task 12: clinical tempeval challenge, participating in phase 1, where we identified text spans of time and event expressions in clinical notes, and in phase 2, where we predicted a relation between an event and the parent document creation time. for temporal entity extraction, we find that a joint inference-based approach using structured prediction outperforms a vanilla recurrent neural network that incorporates word embeddings trained on a variety of large clinical document sets. for document creation time relations, we find that a combination of date canonicalization and distant supervision rules for predicting relations on both events and time expressions improves classification, though gains are limited, likely due to the small scale of the training data.",3 "excitation-damping quantum channels. we study a class of quantum channels describing a quantum system, split into the direct sum of an excited and a ground sector, undergoing a one-way transfer of population from the former to the latter; this construction, which provides a generalization of the amplitude-damping qubit channel, can be regarded as a way to upgrade a trace non-increasing quantum operation, defined on the excited sector, to a possibly trace preserving operation on a larger hilbert space. we provide necessary and sufficient conditions for the complete positivity of such channels, and we also show that complete positivity is equivalent to simple positivity whenever the ground sector is one-dimensional. finally, we examine the time-dependent scenario and characterize the cp-divisible channels and markovian semigroups belonging to this class.",17 "proceedings of the twenty-eighth conference on uncertainty in artificial intelligence (2012). this is the proceedings of the twenty-eighth conference on uncertainty in artificial intelligence, held on catalina island, ca, august 14-18, 2012.",3 "a neural representation of sketch drawings. we present sketch-rnn, a recurrent neural network (rnn) able to construct stroke-based drawings of common objects. the model is trained on thousands of crude human-drawn images representing hundreds of classes. we outline a framework for conditional and unconditional sketch generation, and describe new robust training methods for generating coherent sketch drawings in a vector format.",3 "a void in the hubble tension? the end of the line for the hubble bubble.
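the excitation-damping abstract names the amplitude-damping qubit channel as the simplest member of the class. a minimal numerical check of that base case: the standard kraus operators for damping rate gamma, together with the trace-preservation condition $\sum_i K_i^\dagger K_i = I$ mentioned implicitly by "possibly trace preserving"; the helper names here are ours.

```python
import math

def dagger(m):
    # conjugate transpose of a small matrix given as nested lists
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def amplitude_damping_kraus(gamma):
    """kraus operators of the qubit amplitude-damping channel:
    population flows one way from the excited |1> to the ground |0>."""
    k0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]
    k1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]
    return k0, k1

def is_trace_preserving(kraus, tol=1e-12):
    # sum_i K_i^dagger K_i must equal the identity
    dim = len(kraus[0])
    total = [[0.0] * dim for _ in range(dim)]
    for k in kraus:
        kk = matmul(dagger(k), k)
        total = [[total[i][j] + kk[i][j] for j in range(dim)]
                 for i in range(dim)]
    return all(abs(total[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(dim) for j in range(dim))

k0, k1 = amplitude_damping_kraus(0.3)
tp = is_trace_preserving([k0, k1])
# → True: the channel is trace preserving for any gamma in [0, 1]
```

restricting k0 to the excited sector alone gives the trace non-increasing operation that the paper's construction upgrades to a trace-preserving channel on the enlarged hilbert space.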
the universe may feature large-scale inhomogeneities beyond the standard paradigm, implying that statistical homogeneity and isotropy may be reached only on much larger scales than the usually assumed $\sim$100 mpc. this means that we are not necessarily typical observers and that the copernican principle could be recovered only on super-hubble scales. here, without assuming the validity of the copernican principle, we let cosmic microwave background, baryon acoustic oscillation, type ia supernova, local $h_0$, cosmic chronometer, compton y-distortion and kinetic sunyaev-zeldovich observations constrain the geometrical degrees of freedom of the local structure, which we parametrize via the $\lambda$ltb model -- basically a non-linear radial perturbation of a flrw metric. in order to quantify whether a non-copernican structure could explain away the hubble tension, we pay careful attention to computing the hubble constant in an inhomogeneous universe, and we adopt model selection via both the bayes factor and the akaike information criterion. our results show that, while the $\lambda$ltb model can successfully explain away the $h_0$ tension, it is favored with respect to the $\lambda$cdm model only if one solely considers supernovae in the redshift range that is used to fit the hubble constant, that is, $0.023