full_info,tags
"intrusion detection systems using adaptive regression splines. The past few years have witnessed a growing recognition of intelligent techniques for the construction of efficient and reliable intrusion detection systems. Due to increasing incidents of cyber attacks, building effective intrusion detection systems (IDS) is essential for protecting information systems security, and yet it remains an elusive goal and a great challenge. In this paper, we report a performance analysis of multivariate adaptive regression splines (MARS), neural networks and support vector machines. The MARS procedure builds flexible regression models by fitting separate splines to distinct intervals of the predictor variables. A brief comparison of different neural network learning algorithms is also given.",4
"deep value networks learn to evaluate and iteratively refine structured outputs. We approach structured output prediction by optimizing a deep value network (DVN) to precisely estimate the task loss on different output configurations for a given input. Once the model is trained, we perform inference by gradient descent on continuous relaxations of the output variables to find outputs with promising scores from the value network. When applied to image segmentation, the value network takes an image and a segmentation mask as inputs and predicts a scalar estimating the intersection over union between the input and ground truth masks. For multi-label classification, the DVN's objective is to correctly predict the F1 score for any potential label configuration. The DVN framework achieves state-of-the-art results on multi-label prediction and image segmentation benchmarks.",4
"financial portfolio optimization: computationally guided agents to investigate, analyse and invest!?. Financial portfolio optimization is a widely studied problem in the mathematics, statistics, and financial and computational literature. It adheres to determining an optimal combination of weights associated with financial assets held in a portfolio. In practice, it faces challenges by virtue of varying math. formulations, parameters, business constraints and complex financial instruments. The empirical nature of data is no longer one-sided; it thereby reflects upside and downside trends with repeated yet unidentifiable cyclic behaviours, potentially caused due to high frequency volatile movements in asset trades. Portfolio optimization under such circumstances is theoretically and computationally challenging. This work presents a novel mechanism to reach an optimal solution by encoding a variety of optimal solutions in a solution bank to guide the search process towards the global investment objective formulation. It conceptualizes the role of individual solver agents that contribute optimal solutions to a bank of solutions, and a super-agent solver that learns from the solution bank, and thus reflects a knowledge-based, computationally guided agents approach to investigate, analyse and reach an optimal solution for informed investment decisions. A conceptual understanding of classes of solver agents that represent varying problem formulations, from mathematically oriented deterministic solvers to stochastic-search driven evolutionary and swarm-intelligence based techniques for finding optimal weights, is discussed. An algorithmic implementation is presented through an enhanced neighbourhood generation mechanism in the simulated annealing algorithm. A framework for the inclusion of heuristic knowledge and human expertise from the financial literature related to the investment decision making process is reflected via the introduction of controlled perturbation strategies using a decision matrix for neighbourhood generation.",17
"adaptive visualisation system for construction building information models using saliency. Building information modeling (BIM) is a recent construction process based on a 3D model, containing every component related to the building achievement. Architects, structure engineers, method engineers, and other participants in the building process work on this model through the design-to-construction cycle. The high complexity and the large amount of information included in these models raise several issues, delaying their wide adoption in the industrial world. One of the most important is visualization: professionals have difficulties finding the information relevant to their job. Current solutions suffer from two limitations: the information in BIM models is processed manually, and insignificant information is simply hidden, leading to inconsistencies in the building model. This paper describes a system relying on an ontological representation of the building information to automatically label building elements. Depending on the user's department, the visualization is modified according to these labels by automatically adjusting colors and image properties based on a saliency model. The proposed saliency model incorporates several adaptations to fit the specificities of architectural images.",4
"making sensitivity analysis computationally efficient. To investigate the robustness of the output probabilities of a Bayesian network, a sensitivity analysis can be performed. A one-way sensitivity analysis establishes, for each of the probability parameters of the network, a function expressing a posterior marginal probability of interest in terms of that parameter. Current methods for computing the coefficients in such a function rely on a large number of network evaluations. In this paper, we present a method that requires just a single outward propagation in a junction tree for establishing the coefficients of the functions for all possible parameters; in addition, an inward propagation is required for processing evidence. Conversely, the method requires only a single outward propagation for computing the coefficients of the functions expressing all possible posterior marginals in terms of a single parameter. We extend these results to n-way sensitivity analysis, in which sets of parameters are studied.",4
"an explanation mechanism for bayesian inferencing systems. Explanation facilities are a particularly important feature of expert system frameworks. It is an area in which traditional rule-based expert system frameworks have had mixed results. While explanations about control are well handled, facilities are needed for generating better explanations concerning knowledge base content. This paper approaches the explanation problem by examining the effect of an event on a variable of interest within a symmetric Bayesian inferencing system. We argue that any effect measure operating in this context must satisfy certain properties, and such a measure is proposed. It forms the basis of an explanation facility that allows the user of a generalized Bayesian inferencing system to question the meaning of the knowledge base. That facility is described in detail.",4
"foodnet: recognizing foods using ensemble of deep networks. In this work we propose a methodology for an automatic food classification system which recognizes the contents of a meal from images of the food. We developed a multi-layered deep convolutional neural network (CNN) architecture that takes advantage of features from other deep networks and improves efficiency. Numerous classical handcrafted features and approaches are explored, among which CNNs are chosen as the best performing features. The networks are trained and fine-tuned using preprocessed images, and the filter outputs are fused to achieve higher accuracy. Experimental results on the largest real-world food recognition database, ETH Food-101, and a newly contributed Indian food image database demonstrate the effectiveness of the proposed methodology compared to many benchmark deep learned CNN frameworks.",4
"natural language does not emerge 'naturally' in multi-agent dialog. A number of recent works have proposed techniques for end-to-end learning of communication protocols among cooperative multi-agent populations, and have simultaneously found the emergence of grounded, human-interpretable language in the protocols developed by the agents, learned without any human supervision! In this paper, using a Task & Tell reference game between two agents as a testbed, we present a sequence of 'negative' results culminating in a 'positive' one -- showing that while most agent-invented languages are effective (i.e. achieve near-perfect task rewards), they are decidedly not interpretable or compositional. In essence, we find that natural language does not emerge 'naturally', despite the semblance of ease of natural-language-emergence that one may gather from the recent literature. We discuss how it is possible to coax the invented languages to become more human-like and compositional by increasing restrictions on how the two agents may communicate.",4
"landmark-guided elastic shape analysis of human character motions. Motions of virtual characters in movies and video games are typically generated by recording actors using motion capturing methods. Animations generated this way often need postprocessing, such as improving the periodicity of cyclic animations or generating entirely new motions by interpolation of existing ones. Furthermore, search and classification of recorded motions becomes increasingly important as the amount of recorded motion data grows. In this paper, we apply methods from shape analysis to the processing of animations. More precisely, we use the classical elastic metric model used in shape matching and extend it by incorporating additional inexact feature point information, which leads to improved temporal alignment of different animations.",4
"long-term multi-granularity deep framework for driver drowsiness detection. In real-world driver drowsiness detection from videos, the variation of head pose is so large that few existing methods operating on the global face are capable of extracting effective features, e.g., when looking aside or lowering the head. Temporal dependencies of variable length are also rarely considered by previous approaches, e.g., yawning and speaking. In this paper, we propose a long-term multi-granularity deep framework to detect driver drowsiness in driving videos containing frontal faces. The framework includes two key components: (1) a multi-granularity convolutional neural network (MCNN), a novel network that utilizes a group of parallel CNN extractors on well-aligned facial patches of different granularities, extracts facial representations effectively under large variation of head pose, and, furthermore, can flexibly fuse both detailed appearance clues of the main parts and local to global spatial constraints; (2) a deep long short term memory network applied on the facial representations to explore long-term relationships of variable length over sequential frames, which is capable of distinguishing states with temporal dependencies, such as blinking and closing eyes. Our approach achieves 90.05% accuracy and about 37 fps speed on the evaluation set of the public NTHU-DDD dataset, which is better than the state-of-the-art methods for driver drowsiness detection. Moreover, we build a new dataset named FI-DDD, which has higher precision of drowsy locations in the temporal dimension.",4
"consistent query answering via asp from different perspectives: theory and practice. A data integration system provides transparent access to different data sources by suitably combining their data, providing the user with a unified view of them, called the global schema.
However, source data are generally not under the control of the data integration process, thus integrated data may violate global integrity constraints even in the presence of locally-consistent data sources. In this scenario, it may anyway be interesting to retrieve as much consistent information as possible. The process of answering user queries under global constraint violations is called consistent query answering (CQA). Several notions of CQA have been proposed, e.g., depending on whether integrated information is assumed to be sound, complete, exact or a variant of them. This paper provides a contribution in this setting: it uniforms solutions coming from different perspectives under a common ASP-based core, and provides query-driven optimizations designed for isolating and eliminating inefficiencies of the general approach for computing consistent answers. Moreover, the paper introduces some new theoretical results enriching the existing knowledge on the decidability and complexity of the considered problems. The effectiveness of the approach is evidenced by experimental results. To appear in Theory and Practice of Logic Programming (TPLP).",4
"using simulated annealing to calculate the trembles of trembling hand perfection. Within the literature on non-cooperative game theory, there have been a number of attempts to propose algorithms which compute Nash equilibria. Rather than derive a new algorithm, this paper shows that the family of algorithms known as Markov chain Monte Carlo (MCMC) can be used to calculate Nash equilibria. MCMC is a type of Monte Carlo simulation that relies on Markov chains to ensure its regularity conditions. MCMC has been widely used throughout the statistics and optimization literature, where variants of this algorithm are known as simulated annealing. This paper shows that there is an interesting connection between the trembles that underlie the functioning of this algorithm and the type of Nash refinement known as trembling hand perfection.",4
"neuroevolution on the edge of chaos. Echo state networks represent a special type of recurrent neural network. Recent papers stated that echo state networks maximize their computational performance on the transition between order and chaos, the so-called edge of chaos. This work confirms this statement in a comprehensive set of experiments. Furthermore, echo state networks are compared to networks evolved via neuroevolution. The evolved networks outperform echo state networks; however, the evolution consumes significant computational resources. It is demonstrated that echo state networks with local connections combine the best of both worlds, the simplicity of random echo state networks and the performance of evolved networks. Finally, it is shown that the evolution tends to stay close to the ordered side of the edge of chaos.",4
"a general factorization framework for context-aware recommendations. Context-aware recommendation algorithms focus on refining recommendations by considering additional information available to the system. This topic has gained a lot of attention recently. Among others, several factorization methods were proposed to solve the problem, although most of them assume explicit feedback, which strongly limits their real-world applicability. While these algorithms apply various loss functions and optimization strategies, preference modeling under context is less explored due to the lack of tools allowing for easy experimentation with various models. As context dimensions are introduced beyond users and items, the space of possible preference models and the importance of proper modeling largely increase. In this paper we propose a general factorization framework (GFF), a single flexible algorithm that takes the preference model as an input and computes latent feature matrices for the input dimensions. GFF allows us to easily experiment with various linear models on any context-aware recommendation task, be it explicit or implicit feedback based. Its scaling properties make it usable under real life circumstances as well. We demonstrate the framework's potential by exploring various preference models on a 4-dimensional context-aware problem with contexts that are available for almost any real life dataset. We show in experiments -- performed on five real life, implicit feedback datasets -- that proper preference modelling significantly increases recommendation accuracy, and previously unused models outperform the traditional ones. Novel models in GFF also outperform state-of-the-art factorization algorithms. We also extend the method to be fully compliant with the multidimensional dataspace model, one of the most extensive data models for context-enriched data. The extended GFF allows seamless incorporation of information fac[truncated]",4
"prediction with restricted resources and finite automata. We obtain an index of the complexity of a random sequence by allowing the role of the measure in classical probability theory to be played by a function we call the generating mechanism. Typically, the generating mechanism will be a finite automaton. We generate a set of biased sequences by applying a finite state automaton with a specified number, $m$, of states to the set of binary sequences. Thus we index the complexity of our random sequence by the number of states of the automaton. We detail optimal algorithms to predict sequences generated in this way.",19
"deep neural networks under stress. In recent years, deep architectures have been used for transfer learning with state-of-the-art performance on many datasets. The properties of their features remain, however, largely unstudied from the transfer perspective. In this work, we present an extensive analysis of the resiliency of feature vectors extracted from deep models, with a special focus on the trade-off between performance and compression rate. By introducing perturbations to image descriptions extracted from a deep convolutional neural network, we change their precision and number of dimensions, measuring how this affects the final score. We show that deep features are more robust to these disturbances compared to classical approaches, achieving a compression rate of 98.4% while losing only 0.88% of the original score on Pascal VOC 2007.",4
"making neural qa as simple as possible but not simpler. The recent development of large-scale question answering (QA) datasets has triggered a substantial amount of research into end-to-end neural architectures for QA. Increasingly complex systems have been conceived without comparison to simpler neural baseline systems that would justify their complexity. In this work, we propose a simple heuristic that guides the development of neural baseline systems for the extractive QA task. We find that there are two ingredients necessary for building a high-performing neural QA system: first, awareness of the question words while processing the context and, second, a composition function that goes beyond simple bag-of-words modeling, such as recurrent neural networks. Our results show that FastQA, a system that meets these two requirements, can achieve very competitive performance compared with existing models. We argue that this surprising finding puts the results of previous systems and the complexity of recent QA datasets into perspective.",4
"translation-based constraint answer set solving. We solve constraint satisfaction problems through translation to answer set programming (ASP). Our reformulations have the property that unit-propagation in the ASP solver achieves well defined local consistency properties like arc, bound and range consistency. Experiments demonstrate the computational value of this approach.",4
"api design for machine learning software: experiences from the scikit-learn project. Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.",4
"metalearning and feature selection. A general formulation of optimization problems in which various candidate solutions may use different feature-sets is presented, encompassing supervised classification, automated program learning and other cases. A novel characterization of the concept of a ""good quality feature"" for such an optimization problem is provided; and a proposal regarding the integration of quality based feature selection into metalearning is suggested, wherein the quality of a feature for a problem is estimated using knowledge about related features in the context of related problems. Results are presented from extensive testing of this ""feature metalearning"" approach on supervised text classification problems; it is demonstrated that, in this context, feature metalearning can provide significant and sometimes dramatic speedup over standard feature selection heuristics.",4
"nonlinear supervised dimensionality reduction via smooth regular embeddings. The recovery of the intrinsic geometric structures of data collections is an important problem in data analysis. Supervised extensions of several manifold learning approaches have been proposed in recent years. Meanwhile, existing methods primarily focus on the embedding of the training data, while the generalization of the embedding to initially unseen test data is rather ignored. In this work, we build on recent theoretical results on the generalization performance of supervised manifold learning algorithms. Motivated by these performance bounds, we propose a supervised manifold learning method that computes a nonlinear embedding while constructing a smooth and regular interpolation function that extends the embedding to the whole data space in order to achieve satisfactory generalization. The embedding and the interpolator are jointly learnt such that the Lipschitz regularity of the interpolator is imposed while ensuring the separation between different classes. Experimental results on several image data sets show that the proposed method yields quite satisfactory performance in comparison with other supervised dimensionality reduction algorithms and traditional classifiers.",4
"regularization methods for learning incomplete matrices. We use convex relaxation techniques to provide a sequence of solutions to the matrix completion problem. Using the nuclear norm as a regularizer, we provide simple and very efficient algorithms for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm iteratively replaces the missing elements with those obtained from a thresholded SVD. Using warm starts, this allows us to efficiently compute an entire regularization path of solutions.",19
"multimodal latent variable analysis. We consider a set of multiple, multimodal sensors capturing a complex system or a physical phenomenon of interest. Our primary goal is to distinguish the underlying sources of variability manifested in the measured data. The first step in our analysis is to find the common source of variability present in all sensor measurements. We base our work on a recent paper, which tackles this problem with alternating diffusion (AD). In this work, we suggest an analysis extracting the sensor-specific variables in addition to the common source. We propose an algorithm, which we analyze theoretically, and then demonstrate on three different applications: a synthetic example, a toy problem, and the task of fetal ECG extraction.",4
"topic supervised non-negative matrix factorization.
Topic models have been extensively used to organize and interpret the contents of large, unstructured corpora of text documents. Although topic models often perform well on traditional training vs. test set evaluations, it is often the case that the results of a topic model do not align with human interpretation. This interpretability fallacy is largely due to the unsupervised nature of topic models, which prohibits any user guidance on the results of a model. In this paper, we introduce a semi-supervised method called topic supervised non-negative matrix factorization (TS-NMF), which enables the user to provide labeled example documents to promote the discovery of more meaningful semantic structure of a corpus. In this way, the results of TS-NMF better match the intuition and desired labeling of the user. The core of TS-NMF relies on solving a non-convex optimization problem, for which we derive an iterative algorithm that is shown to be monotonic and convergent to a local optimum. We demonstrate the practical utility of TS-NMF on the Reuters and PubMed corpora, and find that TS-NMF is especially useful for conceptual or broad topics, where topic key terms are not well understood. Although identifying an optimal latent structure for the data is not the primary objective of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard similarity scores than contemporary methods, (unsupervised) NMF and latent Dirichlet allocation, at supervision rates as low as 10% to 20%.",4
"bayesian online changepoint detection. Changepoints are abrupt variations in the generative parameters of a data sequence. Online detection of changepoints is useful in the modelling and prediction of time series in application areas such as finance, biometrics, and robotics. While frequentist methods have yielded online filtering and prediction techniques, most Bayesian papers have focused on the retrospective segmentation problem. Here we examine the case where the model parameters before and after the changepoint are independent and we derive an online algorithm for exact inference of the most recent changepoint. We compute the probability distribution of the length of the current ``run,'' or time since the last changepoint, using a simple message-passing algorithm. Our implementation is highly modular so that the algorithm may be applied to a variety of types of data. We illustrate this modularity by demonstrating the algorithm on three different real-world data sets.",19
"a survey of estimation of mutual information methods to measure dependency versus correlation analysis. In this survey, we present and compare different approaches to estimate mutual information (MI) from data, to analyse general dependencies between variables of interest in a system. We demonstrate the performance difference of MI versus correlation analysis, which is optimal only in the case of linear dependencies. First, we use a piece-wise constant Bayesian methodology using a general Dirichlet prior. In this estimation method, we use a two-stage approach to approximate the probability distribution first and then calculate the marginal and joint entropies. Here, we demonstrate the performance of this Bayesian approach versus others for computing the dependency between different variables. We also compare it with linear correlation analysis. Finally, we apply the MI and correlation analysis to the identification of the bias in the determination of the aerosol optical depth (AOD) by the satellite based Moderate Resolution Imaging Spectroradiometer (MODIS) and the ground based AErosol RObotic NETwork (AERONET). Here, we observe that the AOD measurements by these two instruments might be different for the same location. The reason for this bias is explored by quantifying the dependencies of the bias on 15 other variables, including cloud cover, surface reflectivity and others.",19
"regional active contours based on variational level sets and machine learning for image segmentation. Image segmentation is the problem of partitioning an image into different subsets, where each subset may have a different characterization in terms of color, intensity, texture, and/or other features. Segmentation is a fundamental component of image processing, and plays a significant role in computer vision, object recognition and object tracking. Active Contour Models (ACMs) constitute a powerful energy-based minimization framework for image segmentation, which relies on the concept of contour evolution. Starting from an initial guess, the contour is evolved with the aim of approximating better and better the actual object boundary. Handling complex images in an efficient, effective, and robust way is a real challenge, especially in the presence of intensity inhomogeneity, overlap between foreground/background intensity distributions, objects characterized by many different intensities, and/or additive noise. In this thesis, to deal with these challenges, we propose a number of image segmentation models relying on variational level set methods and specific kinds of neural networks, to handle complex images in both supervised and unsupervised ways. Experimental results demonstrate the high accuracy of the segmentation results obtained by the proposed models on various benchmark synthetic and real images, compared with state-of-the-art active contour models.",4
"the promise and peril of human evaluation for model interpretability. Transparency, user trust, and human comprehension are popular ethical motivations for interpretable machine learning. In support of these goals, researchers evaluate model explanation performance using humans and real world applications. This alone presents a challenge in many areas of artificial intelligence. In this position paper, we propose a distinction between descriptive and persuasive explanations. We discuss reasoning suggesting that functional interpretability may be correlated with cognitive function and user preferences. If this is indeed the case, evaluation and optimization using functional metrics could perpetuate implicit cognitive bias in explanations that threatens transparency. Finally, we propose two potential research directions to disambiguate cognitive function and explanation models, retaining control over the tradeoff between accuracy and interpretability.",4
"assisting composition of email responses: a topic prediction approach. We propose an approach for helping agents compose email replies to customer requests. To enable that, we use LDA to extract latent topics from a collection of email exchanges. We then use these latent topics to label our data, obtaining a so-called ""silver standard"" topic labelling. We exploit this labelled set to train a classifier to: (i) predict the topic distribution of the entire agent's email response, based on features of the customer's email; and (ii) predict the topic distribution of the next sentence in the agent's reply, based on the customer's email features and the features of the agent's current sentence. The experimental results on a large email collection from a contact center in the telecom domain show that the proposed approach is effective in predicting the best topic of the agent's next sentence. In 80% of the cases, the correct topic is present among the top five recommended topics (out of fifty possible ones). This shows the potential of this method to be applied in an interactive setting, where the agent is presented with a small list of likely topics to choose from for the next sentence.",4
"high-order graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting. Traffic forecasting is a challenging task, due to the complicated spatial dependencies on roadway networks and the time-varying traffic patterns. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, the High-Order Graph Convolutional Long Short-Term Memory Neural Network (HGC-LSTM), to learn the interactions between links in the traffic network and forecast the network-wide traffic state. We define the high-order traffic graph convolution based on the physical network topology. The proposed framework employs L1-norms on the graph convolution weights and L2-norms on the graph convolution features to identify the most influential links in the traffic network. We propose a novel Real-Time Branching Learning (RTBL) algorithm for the HGC-LSTM framework to accelerate the training process for spatio-temporal data. Experiments show that our HGC-LSTM network is able to capture the complex spatio-temporal dependencies efficiently present in the traffic network and consistently outperforms state-of-the-art baseline methods on two heterogeneous real-world traffic datasets. The visualization of graph convolution weights shows that the proposed framework can accurately recognize the most influential roadway segments in real-world traffic networks.",4
"a deep q-learning agent for the l-game with variable batch training. We employ the deep Q-learning algorithm with experience replay to train an agent capable of achieving a high level of play in the L-game while self-learning from low-dimensional states. We also employ variable batch size for training in order to mitigate the loss of the rare reward signal and significantly accelerate training. Despite the large action space due to the number of possible moves, the low-dimensional state space and the rarity of rewards, which only come at the end of the game, DQL is successful in training an agent capable of strong play without the use of any search methods or domain knowledge.",4
"modelling legal contracts as processes. This paper concentrates on the representation of the legal relations that obtain between parties who have entered into a contractual agreement, and on the evolution of the agreement as it progresses through time. Contracts are regarded as processes and analysed in terms of the obligations that are active at various points in their life span. An informal notation is introduced that conveniently summarizes the states of an agreement as it evolves in time. The representation enables us to determine what the status of an agreement is, given an event or sequence of events that concern the performance of actions by the agents involved. This is useful both in the context of contract drafting (where parties might wish to preview how their agreement might evolve) and in the context of contract performance monitoring (where parties might wish to establish their legal positions while the agreement is in force). The discussion is based on an example that illustrates typical patterns of contractual obligations.",4
"towards automation of the data quality system for the CERN CMS experiment. Daily operation of a large-scale experiment is a challenging task, particularly from the perspective of routine monitoring of the quality of the data being taken. We describe an approach that uses machine learning for an automated system to monitor data quality, based on the partial use of data qualified manually by detector experts. The system automatically classifies marginal cases, both good and bad data, and uses human expert decisions to classify the remaining ""grey area"" cases. This study uses collision data collected by the CMS experiment at the LHC in 2010. We demonstrate that the proposed workflow is able to automatically process at least 20\% of the samples without noticeable degradation of the result.",15
"the multi-vehicle covering tour problem: building routes for urban patrolling. In this paper we study a particular aspect of urban community policing: routine patrol route planning. We seek routes that guarantee visibility, as this has a sizable impact on the community's perceived safety, allowing quick emergency responses and providing surveillance of selected sites (e.g., hospitals, schools).
planning restricted availability vehicles strives achieve balanced routes. study adaptation model multi-vehicle covering tour problem, set locations must visited, whereas another subset must close enough planned routes. constitutes np-complete integer programming problem. suboptimal solutions obtained several heuristics, adapted literature others developed us. solve adapted instances tsplib instance real data, former compared results literature, latter compared empirical data.",4 "learning depth monocular videos using direct methods. ability predict depth single image - using recent advances cnns - increasing interest vision community. unsupervised strategies learning particularly appealing utilize much larger varied monocular video datasets learning without need ground truth depth stereo. previous works, separate pose depth cnn predictors determined joint outputs minimized photometric error. inspired recent advances direct visual odometry (dvo), argue depth cnn predictor learned without pose cnn predictor. further, demonstrate empirically incorporation differentiable implementation dvo, along novel depth normalization strategy - substantially improves performance state art use monocular videos training.",4 regular expressions decoding neural network outputs. article proposes convenient tool decoding output neural networks trained connectionist temporal classification (ctc) handwritten text recognition. use regular expressions describe complex structures expected writing. corresponding finite automata employed build decoder. analyze theoretically calculations relevant avoided. great speed-up results approximation. conclude approximation likely fails regular expression match ground truth harmful many applications since low probability even underestimated. proposed decoder efficient compared decoding methods. variety applications reaches information retrieval full text recognition. refer applications integrated proposed decoder successfully.,4 "swap swap? 
exploiting dependency word pairs reordering statistical machine translation. reordering poses major challenge machine translation (mt) two languages significant differences word order. paper, present novel reordering approach utilizing sparse features based dependency word pairs. instance features captures whether two words, related dependency link source sentence dependency parse tree, follow order swapped translation output. experiments chinese-to-english translation show statistically significant improvement 1.21 bleu point using approach, compared state-of-the-art statistical mt system incorporates prior reordering approaches.",4 "supervised learning sparse context reconstruction coefficients data representation classification. context data points, usually defined data points data set, found play important roles data representation classification. paper, study problem using context data point classification problem. work inspired observation actually data points critical context data point representation classification. propose represent data point sparse linear combination context, learn sparse context supervised way increase discriminative ability. end, proposed novel formulation context learning, modeling learning context parameter classifier unified objective, optimizing alternative strategy iterative algorithm. experiments three benchmark data set show advantage state-of-the-art context-based data representation classification methods.",4 "subspace alignment domain adaptation. paper, introduce new domain adaptation (da) algorithm source target domains represented subspaces spanned eigenvectors. method seeks domain invariant feature space learning mapping function aligns source subspace target one. show solution corresponding optimization problem obtained simple closed form, leading extremely fast algorithm. present two approaches determine hyper-parameter method corresponding size subspaces. 
In the first approach, we tune the size of the subspaces using a theoretical bound on the stability of the obtained result. In the second approach, we use maximum likelihood estimation to determine the subspace size, which is particularly useful for high-dimensional data. Apart from PCA, we propose a subspace creation method that can outperform partial least squares (PLS) and linear discriminant analysis (LDA) for domain adaptation. We test our method on various datasets and show that, despite its intrinsic simplicity, it outperforms state-of-the-art DA methods.",4 "Separators and adjustment sets in causal graphs: complete criteria and an algorithmic framework. Principled reasoning about the identifiability of causal effects from non-experimental data is an important application of graphical causal models. We present an algorithmic framework for efficiently testing, constructing, and enumerating $m$-separators in ancestral graphs (AGs), a class of graphical causal models that can represent uncertainty about the presence of latent confounders. Furthermore, we prove a reduction from causal effect identification by covariate adjustment to $m$-separation in a subgraph, for directed acyclic graphs (DAGs) and maximal ancestral graphs (MAGs). Jointly, these results yield constructive criteria that characterize all adjustment sets as well as all minimal and minimum adjustment sets for identification of a desired causal effect with multivariate exposures and outcomes in the presence of latent confounding. Our results extend several existing solutions for special cases of these problems. The efficient algorithms allowed us to empirically quantify the identifiability gap between covariate adjustment and the do-calculus in random DAGs, covering a wide range of scenarios. Implementations of our algorithms are provided in the R package dagitty.",4 "Flexible interpretations: a computational model for dynamic uncertainty assessment. The investigations reported in this paper center on the process of dynamic uncertainty assessment during interpretation tasks in a real domain. In particular, we are interested in the nature of the control structure of computer programs that can support multiple interpretations and smooth transitions among them, in real time.
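The subspace alignment abstract above reports a closed-form solution for aligning eigenvector subspaces. A minimal numpy sketch of that idea, aligning PCA bases with the matrix M = Ps^T Pt; illustrative only (the data, the choice d=2, and all names are made up, not the authors' code):

```python
import numpy as np

def pca_basis(X, d):
    """Top-d principal directions (columns) of centered data X (n_samples x n_features)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are directions
    return Vt[:d].T  # shape (n_features, d)

def subspace_alignment(Xs, Xt, d):
    """Project source and target data onto aligned d-dimensional subspaces.

    The Frobenius-optimal map between the source basis Ps and target basis Pt
    has the closed form M = Ps^T Pt, so source data are projected with Ps @ M
    while target data use Pt directly.
    """
    Ps, Pt = pca_basis(Xs, d), pca_basis(Xt, d)
    M = Ps.T @ Pt                              # closed-form alignment matrix
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M       # aligned source features
    Zt = (Xt - Xt.mean(axis=0)) @ Pt           # target features
    return Zs, Zt

rng = np.random.default_rng(0)
Xs = rng.normal(size=(100, 5))
Xt = rng.normal(size=(80, 5)) @ np.diag([3, 2, 1, 0.5, 0.1])
Zs, Zt = subspace_alignment(Xs, Xt, d=2)
print(Zs.shape, Zt.shape)  # (100, 2) (80, 2)
```

Because M is computed in closed form from two small d x d-sized products, the whole adaptation step costs little more than the two PCAs, which matches the "extremely fast algorithm" claim.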
Each step of processing involves the interpretation of one input item and the appropriate re-establishment of the system's confidence in the correctness of its interpretation(s).",4 "StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. Recent studies have shown remarkable success in image-to-image translation between two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models, as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.",4 "Video object segmentation with re-identification. Conventional video segmentation methods often rely on temporal continuity to propagate masks. Such an assumption suffers from issues like drifting and the inability to handle large displacement. To overcome these issues, we formulate an effective mechanism to prevent the target from being lost via adaptive object re-identification. Specifically, our video object segmentation with re-identification (VS-ReID) model includes a mask propagation module and a ReID module. The former module produces an initial probability map by flow warping, while the latter module retrieves missing instances by adaptive matching. With the two modules iteratively applied, VS-ReID records a global mean (region Jaccard and boundary F measure) of 0.699, the best performance in the 2017 DAVIS Challenge.",4 "Toward a statistical mechanics of four letter words. We consider words as a network of interacting letters, and approximate the probability distribution of states taken on by this network.
Despite the intuition that the rules of English spelling are highly combinatorial (and arbitrary), we find that maximum entropy models consistent with pairwise correlations among letters provide a surprisingly good approximation to the full statistics of four letter words, capturing ~92% of the multi-information among letters and even ""discovering"" real words that were not represented in the data from which the pairwise correlations were estimated. The maximum entropy model defines an energy landscape on the space of possible words, and local minima in this landscape account for nearly two-thirds of the words used in written English.",16 "FLAG n' FLARE: fast linearly-coupled adaptive gradient methods. We consider first order gradient methods for effectively optimizing a composite objective in the form of a sum of smooth and, potentially, non-smooth functions. We present accelerated and adaptive gradient methods, called FLAG and FLARE, which can offer the best of both worlds. They can achieve the optimal convergence rate by attaining the optimal first-order oracle complexity for smooth convex optimization. Additionally, they can adaptively and non-uniformly re-scale the gradient direction to adapt to the limited curvature available and to conform to the geometry of the domain. We show theoretically and empirically that, through the compounding effects of acceleration and adaptivity, FLAG and FLARE can be highly effective for many data fitting and machine learning applications.",12 "Least-squares FIR models of low-resolution MR data for efficient phase-error compensation with simultaneous artefact removal. Signal space models in the phase-encode and frequency-encode directions are presented for extrapolation of 2D partial k-space. Using a boxcar representation of the low-resolution spatial data, and a geometrical representation of the signal space vectors in the positive and negative phase-encode directions, a robust predictor is constructed using a series of signal space projections. Compared to existing phase-correction methods, which require the acquisition of a pre-determined set of fractional k-space lines, the proposed predictor is found to be more efficient, due to its capability of exhibiting an equivalent degree of performance using half the number of fractional lines.
Robust filtering of noisy data is achieved using a second signal space model in the frequency-encode direction, bypassing the requirement of a prior highpass filtering operation. The signal space is constructed from the Fourier transformed samples of each row of the low-resolution image. A set of FIR filters is estimated by fitting a least squares model to the signal space. Partial k-space extrapolation using these FIR filters is shown to result in artifact-free reconstruction, particularly with respect to Gibbs ringing and streaking type artifacts.",4 "Process monitoring on sequences of system call count vectors. We introduce a methodology for efficient monitoring of processes running on hosts in a corporate network. The methodology is based on collecting streams of system calls produced by selected processes on the hosts, and sending them over the network to a monitoring server, where machine learning algorithms are used to identify changes in process behavior due to malicious activity, hardware failures, or software errors. The methodology uses sequences of system call count vectors as the data format, which can handle large and varying volumes of data. Unlike previous approaches, the methodology introduced in this paper is suitable for distributed collection and processing of data in large corporate networks. We evaluate the methodology in a laboratory setting and in a real-life setup, and provide statistics characterizing the performance and accuracy of the methodology.",4 "Egocentric pose recognition in four lines of code. We tackle the problem of estimating the 3D pose of an individual's upper limbs (arms+hands) from a chest-mounted depth camera. Importantly, we consider pose estimation during everyday interactions with objects. Past work shows that strong pose+viewpoint priors and depth-based features are crucial for robust performance. In egocentric views, hands and arms are observable within a well-defined volume in front of the camera. We call this volume the egocentric workspace. A notable property is that hand appearance correlates with workspace location. To exploit this correlation, we classify arm+hand configurations in a global egocentric coordinate frame, rather than in a local scanning window. This greatly simplifies the architecture and improves performance.
We propose an efficient pipeline which 1) generates synthetic workspace exemplars for training, using a virtual chest-mounted camera whose intrinsic parameters match our physical camera, 2) computes perspective-aware depth features on this entire volume, and 3) recognizes discrete arm+hand pose classes with a sparse multi-class SVM. Our method provides state-of-the-art hand pose recognition performance on egocentric RGB-D images in real time.",4 "Fast amortized inference and learning in log-linear models with randomly perturbed nearest neighbor search. Inference in log-linear models scales linearly with the size of the output space in the worst case. This is often a bottleneck in natural language processing and computer vision tasks when the output space is feasibly enumerable but large. We propose a method to perform inference in log-linear models with sublinear amortized cost. Our idea hinges on using Gumbel random variable perturbations and a pre-computed maximum inner product search data structure to access the most-likely elements in sublinear amortized time. Our method yields provable runtime and accuracy guarantees. Further, we present empirical experiments on ImageNet and word embeddings showing significant speedups for sampling, inference, and learning in log-linear models.",4 "Application of s-transform and hyper kurtosis based modified duo histogram equalized DIC images for pre-cancer detection. The proposed hyper kurtosis based histogram equalization of DIC images enhances the contrast while preserving the brightness. The evolution and development of precancerous activity among tissues are studied with the s-transform (ST). Significant variations in the amplitude spectra are observed in the time-frequency domain, due to increased medium roughness relative to normal tissue. The randomness and inhomogeneity of the tissue structures among human normal and different grades of DIC tissues are recognized by ST based time-frequency analysis. This study offers a simpler and better way to recognize the substantial changes among the different stages of DIC tissues, reflected in the spatial information contained within the inhomogeneity structures of the different types of tissue.",4 "Applying a fuzzy ID3 decision tree for software effort estimation.
Web effort estimation is the process of predicting the effort and cost, in terms of money, schedule and staff, for a software project or system. Many estimation models have been proposed over the last three decades, and it is believed that they must serve the purposes of budgeting, risk analysis, project planning and control, and project improvement investment analysis. In this paper, we investigate the use of a fuzzy ID3 decision tree for software cost estimation; it is designed by integrating the principles of the ID3 decision tree with fuzzy set-theoretic concepts, enabling the model to handle the uncertain and imprecise data describing software projects, which can greatly improve the accuracy of the obtained estimates. MMRE and Pred are used as measures of prediction accuracy for this study. A series of experiments is reported, using two different software project datasets, namely the Tukutuku and COCOMO'81 datasets. The results are compared with those produced by the crisp version of the ID3 decision tree.",4 "The CMA evolution strategy: a tutorial. This tutorial introduces the CMA evolution strategy (ES), where CMA stands for covariance matrix adaptation. The CMA-ES is a stochastic, or randomized, method for real-parameter (continuous domain) optimization of non-linear, non-convex functions. We try to motivate and derive the algorithm from intuitive concepts and from requirements of non-linear, non-convex search in continuous domain.",4 "A dynamic vulnerability map to assess the risk of road network traffic utilization. The Le Havre agglomeration (CODAH) includes 16 establishments classified Seveso high threshold. In the literature, vulnerability maps are constructed to help decision makers assess risk. These approaches remain static and do not take into account population displacement in the estimation of vulnerability. We propose a decision making tool based on a dynamic vulnerability map to evaluate the difficulty of evacuation in the different sectors of CODAH. We use a geographic information system (GIS) to visualize the map, which evolves with the road traffic state through a detection of communities in large graphs algorithm.",4 "Minimum description length induction, Bayesianism, and Kolmogorov complexity. The relationship between the Bayesian approach and the minimum description length approach is established. We sharpen and clarify the general modeling principles MDL and MML, abstracted as the ideal MDL principle and defined from Bayes's rule by means of Kolmogorov complexity.
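The effort-estimation abstract above evaluates with MMRE and Pred. A short sketch of these standard accuracy measures; the effort values are made up for illustration, not taken from the Tukutuku or COCOMO'81 datasets:

```python
def mre(actual, predicted):
    """Magnitude of relative error for a single project."""
    return abs(actual - predicted) / actual

def mmre(actuals, predictions):
    """Mean magnitude of relative error over all projects."""
    return sum(mre(a, p) for a, p in zip(actuals, predictions)) / len(actuals)

def pred(actuals, predictions, level=0.25):
    """Pred(l): fraction of projects whose MRE is at most l (commonly l = 0.25)."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)

# illustrative effort values in person-months
actuals = [10.0, 40.0, 25.0, 8.0]
predictions = [12.0, 28.0, 24.0, 8.5]
print(round(mmre(actuals, predictions), 3))  # 0.151
print(pred(actuals, predictions))            # 0.75
```

Lower MMRE and higher Pred(0.25) indicate better estimates; the second project here misses the 25% band (MRE = 0.3), which is why Pred is 0.75.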
The basic condition under which the ideal principle should be applied is encapsulated as the fundamental inequality, which in broad terms states that the principle is valid when the data are random, relative to every contemplated hypothesis, and also the hypotheses are random relative to the (universal) prior. Basically, the ideal principle states that the prior probability associated with a hypothesis should be given by the algorithmic universal probability, and the sum of the log universal probability of the model plus the log probability of the data given the model should be minimized. If we restrict the model class to finite sets, then application of the ideal principle turns into Kolmogorov's minimal sufficient statistic. In general, we show that data compression is almost always the best strategy, both in hypothesis identification and prediction.",4 "Monitoring term drift based on semantic consistency in an evolving vector field. Based on the Aristotelian concept of potentiality vs. actuality, allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model, by a combination of random indexing and evolving self-organizing maps (ESOM). To monitor semantic drifts within an observation period, an experiment was carried out on the term space of a collection of 12.8 million Amazon book reviews. For evaluation, the semantic consistency of ESOM term clusters was compared with their respective neighbourhoods in WordNet, and contrasted with distances among term vectors by random indexing. We found that, at the 0.05 level of significance, the terms in the clusters showed a high level of semantic consistency. Tracking the drift of distributional patterns in the term space across time periods, we found that consistency decreased, but not at a statistically significant level. Our method is highly scalable, and has interpretations in philosophy.",4 "Neural speed reading via skim-RNN. Inspired by the principles of speed reading, we introduce skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens. Skim-RNN gives a computational advantage over an RNN that always updates the entire hidden state. Skim-RNN uses the same input and output interfaces as a standard RNN, so it can easily be used in place of RNNs in existing models.
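The skim-RNN abstract above describes updating only a small slice of the hidden state for unimportant tokens. A toy numpy sketch of that gating idea, with linear cells and an externally supplied skim decision standing in for the learned components (all names and sizes are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(1)
d, d_small = 8, 2          # full and "skim" hidden sizes
Wh = rng.normal(scale=0.1, size=(d, d))        # full-update cell weights
Ws = rng.normal(scale=0.1, size=(d_small, d))  # cheap skim-update weights

def skim_rnn_step(h, gate):
    """One step: if the token is deemed unimportant (gate == 'skim'), update
    only the first d_small units and copy the rest of the hidden state;
    otherwise update the full state. Toy linear cells replace real RNN cells."""
    if gate == "skim":
        h_new = h.copy()
        h_new[:d_small] = np.tanh(Ws @ h)  # cheap partial update
        return h_new
    return np.tanh(Wh @ h)                 # full (expensive) update

h = rng.normal(size=d)
h_full = skim_rnn_step(h, "full")
h_skim = skim_rnn_step(h, "skim")
# the skimmed step leaves the last d - d_small units untouched
print(np.allclose(h_skim[d_small:], h[d_small:]))  # True
```

The computational saving comes from the skim path costing O(d_small * d) instead of O(d * d); in the real model, the skim/full decision is itself learned per token.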
In experiments, we show that skim-RNN can achieve significantly reduced computational cost without losing accuracy, compared to standard RNNs, across five different natural language tasks. In addition, we demonstrate that the trade-off between accuracy and speed of skim-RNN can be dynamically controlled at inference time in a stable manner. Our analysis also shows that skim-RNN running on a single CPU offers lower latency compared to standard RNNs on GPUs.",4 "Recurrent deep stacking networks for speech recognition. This paper presents our work on applying recurrent deep stacking networks (RDSNs) to robust automatic speech recognition (ASR) tasks. In the paper, we also propose an efficient yet comparable substitute for the RDSN, the bi-pass stacking network (BPSN). The main idea of the two models is to add phoneme-level information into acoustic models, transforming an acoustic model into the combination of an acoustic model and a phoneme-level n-gram model. Experiments showed that RDSNs and BPSNs can substantially improve the performance over conventional DNNs.",4 "Image disguise based on a generative model. To protect image contents, most existing encryption algorithms are designed to transform an original image into a texture-like or noise-like image, which is, however, an obvious visual sign indicating the presence of an encrypted image, and results in a significantly large number of attacks. To solve this problem, in this paper, we propose a new image encryption method that generates a visually similar image to the original one, by sending a meaning-normal and independent image to a corresponding well-trained generative model, to achieve the effect of disguising the original image. This image disguise method not only solves the problem of obvious visual implication, but also guarantees the security of the information.",4 "Candidates v.s. noises estimation for large multi-class classification problem. This paper proposes a method for multi-class classification problems, where the number of classes $k$ is large. The method, referred to as {\em candidates v.s. noises estimation} (CANE), selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance, approaching that of the maximum likelihood estimator when the observed label belongs to the selected candidates with high probability.
In practice, we use a tree structure with leaves as classes to promote fast beam search for candidate selection. We also apply the CANE method to estimate word probabilities in neural language models. Experiments show that CANE achieves better prediction accuracy over noise-contrastive estimation (NCE), its variants and a number of state-of-the-art tree classifiers, while it gains significant speedup compared to standard $\mathcal{o}(k)$ methods.",19 "A distance function of D numbers. Dempster-Shafer theory is widely applied to uncertainty modelling and knowledge reasoning, because of its ability of expressing uncertain information. A distance between two basic probability assignments (BPAs) presents a measure of performance for identification algorithms based on the evidential theory of Dempster-Shafer. However, some conditions lead to limitations in the practical application of Dempster-Shafer theory, such as the exclusiveness hypothesis and the completeness constraint. To overcome these shortcomings, a novel theory called D numbers theory has been proposed. A distance function of D numbers is proposed here to measure the distance between two D numbers. The distance function of D numbers is a generalization of the distance between two BPAs, which inherits the advantages of Dempster-Shafer theory and strengthens the capability of uncertainty modeling. An illustrative case is provided to demonstrate the effectiveness of the proposed function.",4 "Efficient and effective single-document summarizations and a word-embedding measurement of quality. Our task is to generate an effective summary for a given document with specific realtime requirements. We use the softplus function to enhance keyword rankings to favor important sentences, based on which we present a number of summarization algorithms using various keyword extraction and topic clustering methods. We show that these algorithms meet the realtime requirements and yield the best ROUGE recall scores on DUC-02 over all previously-known algorithms. To evaluate the quality of summaries without human-generated benchmarks, we define a measure called WESM, based on word embeddings and using the word mover's distance.
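The summarization abstract above boosts keyword rankings with the softplus function. A minimal sketch of softplus-weighted sentence scoring, assuming a simple count-based score; the sentences and keywords are made up, and the paper's actual extraction and clustering methods are more elaborate:

```python
import math
from collections import Counter

def softplus(x):
    """Smooth, always-positive ramp: log(1 + e^x)."""
    return math.log1p(math.exp(x))

def rank_sentences(sentences, keywords):
    """Score each sentence by the sum of softplus-boosted keyword counts,
    so sentences containing important words float to the top."""
    scored = []
    for s in sentences:
        words = Counter(w.lower().strip(".,") for w in s.split())
        score = sum(softplus(words[k]) for k in keywords if k in words)
        scored.append((score, s))
    return [s for _, s in sorted(scored, reverse=True)]

sentences = [
    "The cat sat on the mat.",
    "Summarization extracts the key sentences of a document.",
    "Keyword rankings favor important sentences in summarization.",
]
keywords = {"summarization", "keyword", "sentences"}
print(rank_sentences(sentences, keywords)[0])
```

Because softplus grows smoothly rather than linearly in the count, one additional occurrence of an already-frequent keyword changes a sentence's score less than the first occurrence does.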
We show that the orderings of the ROUGE and WESM scores of our algorithms are highly comparable, suggesting that WESM may serve as a viable alternative for measuring the quality of a summary.",4 "Cost adaptation for robust decentralized swarm behaviour. A multi-agent swarm system is a robust paradigm which can drive efficient completion of complex tasks even under energy limitations and time constraints. However, coordination of a swarm from a centralized command center can be difficult, particularly as the swarm becomes large and spans wide ranges. Here, we leverage the propagation of messages based on mesh-networking protocols for global communication in the swarm, and online cost-optimization through decentralized receding horizon control to drive decentralized decision-making. Our cost-based formulation allows a wide range of tasks to be encoded. To ensure this, we implement a method of adaptation of costs and constraints that ensures effectiveness with novel tasks, network delays, heterogeneous flight capabilities, and increasingly large swarms. We use the Unity3D game engine to build a simulator capable of introducing artificial networking failures and delays in the swarm. Using the simulator, we validate our method on an example coordinated exploration task. We release our simulator and code to the community for future work.",4 "A prototype of a knowledge-based programming environment. In this paper, we present a proposal for a knowledge-based programming environment. In such an environment, declarative background knowledge, procedures, and concrete data are represented in suitable languages and combined in a flexible manner. This leads to a highly declarative programming style. We illustrate the approach on an example and report on a prototype implementation.",4 "Approximate muscle guided beam search for the three-index assignment problem. As a well-known NP-hard problem, the three-index assignment problem (AP3) has attracted lots of research effort on developing heuristics. However, existing heuristics either obtain less competitive solutions or consume too much time. In this paper, a new heuristic named approximate muscle guided beam search (AMBS) is developed to achieve a good trade-off between solution quality and running time. By combining the approximate muscle with beam search, the size of the solution space can be significantly decreased, and thus the time for searching a solution can be sharply reduced.
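The AMBS abstract above builds on beam search. A generic beam search sketch on a toy maximization task; the heuristic's "approximate muscle" component (which restricts the candidate space) is not modeled here, and all names are illustrative:

```python
def beam_search(start, expand, score, beam_width, steps):
    """Generic beam search: keep only the `beam_width` best partial solutions
    at each step. This prunes the search space drastically, at the risk of
    discarding the branch that leads to the true optimum."""
    beam = [start]
    for _ in range(steps):
        candidates = [c for s in beam for c in expand(s)]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(beam, key=score)

# toy task: build a 4-digit sequence maximizing the digit sum,
# where each step appends 0, 1 or 2 (the optimum is clearly 2,2,2,2)
best = beam_search(
    start=(),
    expand=lambda s: [s + (d,) for d in (0, 1, 2)],
    score=sum,
    beam_width=3,
    steps=4,
)
print(best)  # (2, 2, 2, 2)
```

With `beam_width=1` this degenerates to greedy search; widening the beam trades running time for solution quality, which is exactly the trade-off the abstract targets.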
Extensive experimental results on the benchmark indicate that the new algorithm is able to obtain solutions of competitive quality, and that it can be employed on large-scale instances. This work not only proposes a new efficient heuristic, but also provides a promising method to improve the efficiency of beam search.",4 "Doctag2vec: an embedding based multi-label learning approach for document tagging. Tagging news articles or blog posts with relevant tags from a collection of predefined ones is coined as document tagging in this work. Accurate tagging of articles can benefit several downstream applications, such as recommendation and search. In this work, we propose a novel yet simple approach called doctag2vec to accomplish this task. We substantially extend word2vec and doc2vec---two popular models for learning distributed representations of words and documents. In doctag2vec, we simultaneously learn the representations of words, documents, and tags in a joint vector space during training, and employ a simple $k$-nearest neighbor search to predict tags for unseen documents. In contrast to previous multi-label learning methods, doctag2vec directly deals with raw text instead of a provided feature vector, and in addition enjoys advantages like the learning of tag representations and the ability of handling newly created tags. To demonstrate the effectiveness of our approach, we conduct experiments on several datasets and show promising results against state-of-the-art methods.",4 "Proposing LT based search in PDM systems for better information retrieval. PDM systems contain and manage heavy amounts of data, but the search mechanisms in most of these systems are not intelligent enough to process a user's natural language based queries to extract the desired information. The currently available search mechanisms in almost all PDM systems are not efficient, being based on old ways of searching information, i.e., entering the relevant information in the respective fields of search forms to find specific information from the attached repositories. Targeting this issue, thorough research was conducted in the fields of PDM systems and language technology. Concerning PDM systems, the conducted research provides information about PDM and PDM systems in detail.
Concerning the field of language technology, it helps in implementing a search mechanism for PDM systems that can find a user's needed information by analyzing the user's natural language based requests. The accomplished goal of this research was to support the field of PDM with the new proposition of a conceptual model for the implementation of natural language based search. The proposed conceptual model was successfully designed and partially implemented in the form of a prototype. Describing the proposition in detail, the main concept, implementation designs, and developed prototype of the proposed approach are discussed in this paper. The implemented prototype is compared with the respective functions of existing PDM systems, i.e., Windchill and CIM, to evaluate its effectiveness against the targeted challenges.",4 "Distributed air traffic control: a human safety perspective. Issues in air traffic control have so far been addressed with the intent to improve resource utilization and achieve optimized solutions with respect to fuel consumption of aircraft, and efficient usage of the available airspace with minimal congestion related losses, under various dynamic constraints. The focus has almost always been on smarter management of traffic to increase profits; human safety, though achieved in the process, has, we believe, remained less seriously attended. This has become important given the overburdened and overstressed air traffic controllers managing hundreds of airports and thousands of aircraft per day. We propose a multiagent system based distributed approach to handle air traffic, ensuring complete human (passenger) safety without removing humans (ground controllers) from the loop, thereby also retaining the earlier advantages in the new solution. A detailed design of the agent system, easily interfaceable with the existing environment, is described. Based on initial findings from simulations, we strongly believe the system is capable of handling the nuances involved, and is extendable and customizable at a later point in time.",4 "Representation of texts as complex networks: a mesoscopic approach. Statistical techniques to analyze texts, referred to as text analytics, have departed from the use of simple word count statistics towards a new paradigm. Text mining now hinges on a more sophisticated set of methods, including representations in terms of complex networks.
While well-established word-adjacency (co-occurrence) methods successfully grasp syntactical features of written texts, they are unable to represent important aspects of textual data, such as its topical structure, i.e. the sequence of subjects developing at a mesoscopic level along the text. Such aspects are often overlooked by current methodologies. In order to grasp the mesoscopic characteristics of the semantical content of written texts, we devised a network model which is able to analyze documents in a multi-scale fashion. In the proposed model, limited amounts of adjacent paragraphs are represented as nodes, which are connected whenever they share a minimum amount of semantical content. To illustrate the capabilities of our model, we present, as a case example, a qualitative analysis of ""Alice's Adventures in Wonderland"". We show that the mesoscopic structure of a document, modeled as a network, reveals many semantic traits of texts. Such an approach paves the way for a myriad of semantic-based applications. In addition, our approach is illustrated in a machine learning context, in which texts are classified between real texts and randomized instances.",4 "Teaching machines to code: neural markup generation with visual attention. We present a deep recurrent neural network model with soft visual attention that learns to generate LaTeX markup of real-world math formulas given their images. Applying neural sequence generation techniques that have been very successful in the fields of machine translation and image/handwriting/speech captioning, recognition, transcription and synthesis, we construct an image-to-markup model that learns to produce syntactically and semantically correct LaTeX markup code over 150 words long, and achieves a BLEU score of 89%, the best reported so far for the im2latex problem. We also visually demonstrate that the model learns to scan the image left-right / up-down, much as a human would read it.",4 "Statistical keyword detection in literary corpora. Understanding the complexity of human language requires an appropriate analysis of the statistical distribution of words in texts. We consider the information retrieval problem of detecting and ranking the relevant words of a text by means of statistical information referring to the ""spatial"" use of the words. Shannon's entropy of information is used as a tool for automatic keyword extraction.
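The keyword-detection abstract above uses Shannon entropy of the "spatial" distribution of a word across a text. A minimal sketch of that idea, scoring how evenly a word spreads over equal text blocks; the toy word list is made up, and real detectors normalize against shuffled-text baselines as the abstract notes:

```python
import math
from collections import Counter

def spatial_entropy(text_words, target, n_blocks=4):
    """Shannon entropy of a word's occurrence distribution over equal text blocks.

    Words that cluster in a few parts of the text have low entropy relative to
    the uniform maximum log2(n_blocks) and are candidate keywords; evenly
    spread function words score near the maximum."""
    block = max(1, len(text_words) // n_blocks)
    counts = Counter(i // block for i, w in enumerate(text_words) if w == target)
    total = sum(counts.values())
    if total == 0:
        return None
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# toy "text": 'species' occurs only in the first block, 'the' everywhere
words = ["species"] * 6 + ["the"] * 26
print(spatial_entropy(words, "species"))          # 0.0 (fully clustered)
print(spatial_entropy(words, "the") > 1.5)        # True (spread out)
```

Ranking words by how far their entropy falls below that of the same word in a randomly shuffled text is one way to turn this score into the calibrated keyword index the abstract describes.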
Using The Origin of Species by Charles Darwin as a representative text sample, we show the performance of our detector and compare it with another proposal in the literature. The randomly shuffled text receives special attention as a tool for calibrating the ranking indices.",4 "On the representation and embedding of knowledge bases beyond binary relations. The models developed to date for knowledge base embedding are all based on the assumption that the relations contained in knowledge bases are binary. For the training and testing of these embedding models, multi-fold (or n-ary) relational data are converted to triples (e.g., in the FB15k dataset) and interpreted as instances of binary relations. This paper presents a canonical representation of knowledge bases containing multi-fold relations. We show that the existing embedding models on the popular FB15k datasets correspond to a sub-optimal modelling framework, resulting in a loss of structural information. We advocate a novel modelling framework, which models multi-fold relations directly using the canonical representation. Using this framework, the existing TransH model is generalized to a new model, m-TransH. We demonstrate experimentally that m-TransH outperforms TransH by a large margin, thereby establishing a new state of the art.",4 "AFFACT - alignment-free facial attribute classification technique. Facial attributes are soft-biometrics that allow limiting the search space, e.g., by rejecting identities with non-matching facial characteristics such as nose sizes or eyebrow shapes. In this paper, we investigate how the latest versions of deep convolutional neural networks, ResNets, perform on the facial attribute classification task. We test two loss functions, the sigmoid cross-entropy loss and the Euclidean loss, and find that there is little difference between the two in classification performance. Using an ensemble of three ResNets, we obtain a new state-of-the-art facial attribute classification error of 8.00% on the aligned images of the CelebA dataset. More significantly, we introduce the alignment-free facial attribute classification technique (AFFACT), a data augmentation technique that allows a network to classify facial attributes without requiring alignment beyond detected face bounding boxes.
To the best of our knowledge, we are the first to report similar accuracy when using detected bounding boxes -- rather than requiring alignment based on automatically detected facial landmarks -- and we improve classification accuracy by rotating and scaling the test images. We show that this approach outperforms the CelebA baseline on unaligned images, with a relative improvement of 36.8%.",4 "Interpretation of mammogram and chest x-ray reports using deep neural networks - preliminary results. Radiology reports are an important means of communication between radiologists and referring physicians. These reports express a radiologist's interpretation of a medical imaging examination and are critical in establishing a diagnosis and formulating a treatment plan. In this paper, we propose a bi-directional convolutional neural network (Bi-CNN) model for the interpretation and classification of mammogram reports based on breast density, and of chest radiographic radiology reports on the basis of chest pathology. The proposed approach helps to organize databases of radiology reports, retrieve them expeditiously, and evaluate the radiology report, and it could be used in an auditing system to decrease incorrect diagnoses. Our study revealed that the proposed Bi-CNN outperforms the random forest and support vector machine methods.",4 "Saliency benchmarking: separating models, maps and metrics. The field of fixation prediction is heavily model-driven, with dozens of new models published every year. However, progress in the field can be difficult to judge, because models are compared using a variety of inconsistent metrics. As soon as a saliency map is optimized for a certain metric, it is penalized by other metrics. We propose a principled approach to solve the benchmarking problem: we separate the notions of saliency models and saliency maps. We define a saliency model to be a probabilistic model of fixation density prediction and, inspired by Bayesian decision theory, a saliency map to be a metric-specific prediction derived from the model density which maximizes the expected performance on that metric. We derive the optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC, NSS, CC, SIM, KL-Div), and show that they can be computed analytically or approximated with high precision using the model density. We show that this leads to consistent rankings across all metrics and avoids the penalties of using one saliency map for all metrics.
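As a concrete example of one metric named in the saliency-benchmarking abstract above, here is a minimal numpy sketch of NSS (normalized scanpath saliency); the toy map and fixation lists are made up for illustration:

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized scanpath saliency: z-score the map, then average the
    z-values at the fixated pixels. Higher means fixations fall on
    regions the map considers salient; chance level is 0."""
    z = (saliency_map - saliency_map.mean()) / saliency_map.std()
    rows, cols = zip(*fixations)
    return z[rows, cols].mean()

sal = np.zeros((4, 4))
sal[1, 1] = sal[2, 2] = 1.0                 # two salient spots
on_spots = nss(sal, [(1, 1), (2, 2)])       # fixations on the spots
off_spots = nss(sal, [(0, 0), (3, 3)])      # fixations elsewhere
print(on_spots > 0, off_spots < 0)          # True False -> prints: True True
```

Because NSS rewards high z-scored values at fixations, the metric-specific "optimal map" the abstract derives for NSS differs from, say, the AUC-optimal map computed from the same underlying fixation density.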
In this framework, ""good"" models will perform well on all metrics.",4 "Selecting the best player formation for corner-kick situations based on Bayes' estimation. In the domain of the soccer simulation 2D league of the RoboCup project, appropriate player positioning against a given opponent team is an important factor of soccer team performance. This work proposes a model that decides the strategy to be applied against a particular opponent team. This task is realized by applying a preliminary learning phase in which the model determines the effective strategies for clusters of opponent teams. The model determines the best strategies using sequential Bayes' estimators. In a first trial of the system, the proposed model is used to determine the association of player formations with opponent teams in the particular situation of a corner-kick. The implemented model shows satisfying abilities to compare player formations that are similar in terms of performance, and determines the right ranking even when running a decent number of simulation games.",4 "Is it morally acceptable for a system to lie to persuade me?. Given the fast rise of increasingly autonomous artificial agents and robots, a key acceptability criterion will be the possible moral implications of their actions. In particular, intelligent persuasive systems (systems designed to influence humans via communication) constitute a highly sensitive topic because of their intrinsically social nature. Still, ethical studies in this area are rare and tend to focus on the output of the required action. Instead, this work focuses on the persuasive acts themselves (e.g. ""is it morally acceptable that a machine lies or appeals to the emotions of a person to persuade her, even for a good end?""). Exploiting a behavioral approach, based on human assessment of moral dilemmas -- i.e. without any prior assumption of underlying ethical theories -- this paper reports on a set of experiments. These experiments address the type of persuader (human or machine), the strategies adopted (purely argumentative, appeal to positive emotions, appeal to negative emotions, lie) and the circumstances. Findings display no differences due to the agent, mild acceptability for persuasion, and reveal that truth-conditional reasoning (i.e. argument validity) is a significant dimension affecting subjects' judgment.
Implications for the design of intelligent persuasive systems are discussed.",4 "Large-scale domain adaptation via teacher-student learning. High accuracy speech recognition requires a large amount of transcribed data for supervised training. In the absence of such data, domain adaptation of a well-trained acoustic model can be performed, but even here, high accuracy usually requires significant labeled data from the target domain. In this work, we propose an approach to domain adaptation that does not require transcriptions, but instead uses a corpus of unlabeled parallel data, consisting of pairs of samples from the source domain of the well-trained model and the desired target domain. To perform adaptation, we employ teacher/student (T/S) learning, in which the posterior probabilities generated by the source-domain model are used in lieu of labels to train the target-domain model. We evaluate the proposed approach in two scenarios, adapting a clean acoustic model to noisy speech, and adapting an adults' speech acoustic model to children's speech. Significant improvements in accuracy are obtained, with reductions in word error rate of up to 44% over the original source model, without the need for transcribed data in the target domain. Moreover, we show that increasing the amount of unlabeled data results in additional model robustness, which is particularly beneficial when using simulated training data in the target domain.",4 "Temporal human action segmentation via dynamic clustering. We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring. The proposed algorithm is unsupervised, fast, generic in processing various types of features, and applicable in both online and offline settings. We perform extensive experiments on processing data streams, and show that our algorithm achieves state-of-the-art results for both online and offline settings.",4 "Indefinite kernel logistic regression. Traditionally, kernel learning methods require the positive definiteness of the kernel, which is too strict and excludes many sophisticated similarities, which are indefinite, in the multimedia area. To utilize those indefinite kernels, indefinite learning methods are of great interest. This paper aims at the extension of logistic regression from positive semi-definite kernels to indefinite kernels.
the new model, called indefinite kernel logistic regression (iklr), keeps consistency with the regular klr formulation but essentially becomes non-convex. thanks to the positive decomposition of an indefinite matrix, iklr can be transformed into a difference of two convex models, which is then solved using the concave-convex procedure. moreover, we employ an inexact solving scheme to speed up the sub-problem and develop a concave-inexact-convex procedure (ccicp) algorithm with theoretical convergence analysis. systematical experiments on multi-modal datasets demonstrate the superiority of the proposed iklr method over kernel logistic regression with positive definite kernels and state-of-the-art indefinite-learning-based algorithms.",4 "well-typed lightweight situation calculus. the situation calculus is widely applied in artificial intelligence and related fields. this formalism is considered a dialect of the logic programming language and is mostly used for dynamic domain modeling. however, type systems are hardly deployed in the situation calculus literature. to achieve correct and sound typed programs written in the situation calculus, adding typing elements to the current situation calculus is quite helpful. in this paper, we propose to add typing mechanisms to the current version of the situation calculus, especially for its three basic elements: situations, actions and objects, and to perform rigid type checking for existing situation calculus programs in order to find the well-typed and ill-typed ones. in this way, the type correctness and soundness of situation calculus programs are guaranteed by type checking based on this type system. the modified version of the lightweight situation calculus is proved to be a robust well-typed system.",4 "online adaptive pseudoinverse solutions for elm weights. the elm method has become widely used for classification and regression problems as a result of its accuracy, simplicity and ease of use. the solution of the hidden layer weights by means of a matrix pseudoinverse operation is a significant contributor to the utility of the method; however, the conventional calculation of the pseudoinverse by means of a singular value decomposition (svd) is not always practical for large data sets or for online updates to the solution. this paper discusses incremental methods for solving the pseudoinverse which are suitable for elm.
we show that a careful choice of methods allows us to optimize the accuracy, ease of computation, and adaptability of the solution.",4 "constraint propagation for first-order logic and inductive definitions. constraint propagation is one of the basic forms of inference in many logic-based reasoning systems. in this paper, we investigate constraint propagation for first-order logic (fo), a suitable language to express a wide variety of constraints. we present an algorithm with polynomial-time data complexity for constraint propagation in the context of an fo theory and a finite structure. we show that constraint propagation performed in this manner can be represented by a datalog program and that the algorithm can be executed symbolically, i.e., independently of the structure. next, we extend the algorithm to fo(id), the extension of fo with inductive definitions. finally, we discuss several applications.",4 "comparative studies on decentralized multiloop pid controller design using evolutionary algorithms. decentralized pid controllers are designed in this paper for simultaneous tracking of individual process variables in multivariable systems under a step reference input. the controller design framework takes into account the minimization of a weighted sum of the integral of time multiplied squared error (itse) and the integral of squared controller output (isco), so as to balance the overall tracking errors of the process variables and the required variation in the corresponding manipulated variables. the decentralized pid gains are tuned using three popular evolutionary algorithms (eas), viz. the genetic algorithm (ga), evolutionary strategy (es) and cultural algorithm (ca). credible simulation comparisons are reported for four benchmark 2x2 multivariable processes.",4 "multiresolution hierarchical analysis of astronomical spectroscopic cubes using a 3d discrete wavelet transform. the intrinsically hierarchical and blended structure of interstellar molecular clouds, plus the ever increasing resolution of astronomical instruments, demand advanced and automated pattern recognition techniques for identifying and connecting source components in spectroscopic cubes.
we extend the work done on multiresolution analysis using wavelets for astronomical 2d images to 3d spectroscopic cubes, combining the results with the dendrograms approach to offer a hierarchical representation of the connections between sources at different scale levels. we test the approach on real data from the alma observatory, exploring different wavelet families and assessing the main parameter for source identification (i.e., rms) at each level. our approach shows that it is feasible to perform multiresolution analysis in the spatial and frequency domains simultaneously rather than analyzing each spectral channel independently.",4 "a unified approach to error bounds for structured convex optimization problems. error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. in this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. this class encapsulates fairly general constrained minimization problems as well as various regularized loss minimization formulations in machine learning, signal processing, and statistics. using this framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. to further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. we then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. consequently, we obtain a rather complete answer to a question raised by tseng. we believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.",12 "on face segmentation, face swapping, and face perception. we show that even when face images are unconstrained and arbitrarily paired, face swapping is actually quite simple. to this end, we make the following contributions.
(a) instead of tailoring systems for face segmentation, as others have previously proposed, we show that a standard fully convolutional network (fcn) can achieve remarkably fast and accurate segmentations, provided that it is trained on a rich enough example set. for this purpose, we describe novel data collection and generation routines which provide challenging segmented face examples. (b) we use our segmentations to enable robust face swapping under unprecedented conditions. (c) unlike previous work, our swapping is robust enough to allow for extensive quantitative tests. to this end, we use the labeled faces in the wild (lfw) benchmark and measure the effect of intra- and inter-subject face swapping on recognition. we show that our intra-subject swapped faces remain as recognizable as their sources, testifying to the effectiveness of our method. in line with well known perceptual studies, we show that better face swapping produces less recognizable inter-subject results. this is the first time this effect has been quantitatively demonstrated for machine vision systems.",4 "homomorphic signal processing and deep neural networks: constructing deep algorithms for polyphonic music transcription. this paper presents a new approach to understanding how deep neural networks (dnns) work by applying homomorphic signal processing techniques. focusing on the task of multi-pitch estimation (mpe), this paper demonstrates the equivalence relation between the generalized cepstrum and the dnn in terms of their structures and functionality. this equivalence relation, together with pitch perception theories and the recently established rectified-correlations-on-a-sphere (recos) filter analysis, provides an alternative way of explaining the role of the nonlinear activation function and the multi-layer structure, both of which exist in the cepstrum and the dnn. to validate the efficacy of this new approach, a new feature designed in this fashion is proposed for the pitch salience function. the new feature outperforms the one-layer spectrum in the mpe task and, as predicted, it addresses the issue of the missing fundamental effect and also achieves better robustness to noise.",4 "graph partitioning via parallel submodular approximation to accelerate distributed machine learning. distributed computing excels at processing large scale data, but the communication cost for synchronizing shared parameters may slow down overall performance.
fortunately, the interactions between parameters and data in many problems are sparse, which admits an efficient partition in order to reduce the communication overhead. in this paper, we formulate data placement as a graph partitioning problem. we propose a distributed partitioning algorithm and give theoretical guarantees for it. we also provide a highly efficient implementation of the algorithm and demonstrate promising results on text datasets and social networks. we show that the proposed algorithm leads to a 1.6x speedup of a state-of-the-art distributed machine learning system by eliminating 90\% of the network communication.",4 "fast eigenspace approximation using random signals. we focus in this work on the estimation of the first $k$ eigenvectors of the graph laplacian using the filtering of gaussian random signals. we prove that we only need $k$ such signals to be able to exactly recover as many of the smallest eigenvectors, regardless of the number of nodes in the graph. in addition, we address key issues in implementing the theoretical concepts in practice using accurate approximated methods. we also propose fast algorithms for the eigenspace approximation and for the determination of the $k$th smallest eigenvalue $\lambda_k$. the latter proves to be extremely efficient under the assumption of a locally uniform distribution of the eigenvalue spectrum. finally, we present experiments which show the validity of our method in practice and compare it to state-of-the-art methods for clustering and visualization, both on synthetic small-scale datasets and on larger real-world problems of millions of nodes. we show that our method allows better scaling with the number of nodes than previous methods while achieving an almost perfect reconstruction of the eigenspace formed by the first $k$ eigenvectors.",4 "a linear shift invariant multiscale transform. this paper presents a multiscale decomposition algorithm. unlike standard wavelet transforms, the proposed operator is both linear and shift invariant. the central idea is to obtain shift invariance by averaging the aligned wavelet transform projections over all circular shifts of the signal. it is shown how the transform can be obtained from a linear filter bank.",4 "a parallel corpus of translationese. we describe a set of bilingual english--french and english--german parallel corpora in which the direction of translation is accurately and reliably annotated.
the corpora are diverse, consisting of parliamentary proceedings, literary works, transcriptions of ted talks and political commentary. they will be instrumental for research of translationese and its applications to (human and machine) translation; specifically, they can be used for the task of translationese identification, a research direction that enjoys growing interest in recent years. to validate the quality and reliability of the corpora, we replicated previous results of supervised and unsupervised identification of translationese, and extended the experiments to additional datasets and languages.",4 "lifted region-based belief propagation. due to the intractable nature of exact lifted inference, research has recently focused on the discovery of accurate and efficient approximate inference algorithms in statistical relational models (srms), such as lifted first-order belief propagation. fobp simulates propositional factor graph belief propagation without constructing the ground factor graph by identifying and lifting over redundant message computations. in this work, we propose a generalization of fobp called lifted generalized belief propagation, in which both the region structure and the message structure can be lifted. this approach allows more inference to be performed intra-region (in the exact inference step of bp), thereby allowing the simulation of propagation on a graph structure with larger region scopes and fewer edges, while still maintaining tractability. we demonstrate that the resulting algorithm converges in fewer iterations to more accurate results on a variety of srms.",4 "pediatric bone age assessment using deep convolutional neural networks. skeletal bone age assessment is a common clinical practice to diagnose endocrine and metabolic disorders in child development. in this paper, we describe a fully automated deep learning approach to the problem of bone age assessment using data from the pediatric bone age challenge organized by rsna 2017. the dataset for this competition consisted of 12.6k radiological images of the left hand labeled with the bone age and sex of the patients. our approach utilizes several deep learning architectures: u-net, resnet-50, and custom vgg-style neural networks trained end-to-end. we use images of whole hands as well as specific parts of a hand for both training and inference.
this approach allows us to measure the importance of specific hand bones for automated bone age analysis. we evaluate the performance of the method in the context of skeletal development stages. our approach outperforms common methods for bone age assessment.",4 parameterized complexity results for symmetry breaking. symmetry is a common feature of many combinatorial problems. unfortunately, eliminating all symmetry from a problem is often computationally intractable. this paper argues that recent parameterized complexity results provide insight into that intractability and help identify special cases in which symmetry can be dealt with tractably,4 "polyglot semantic parsing in apis. traditional approaches to semantic parsing (sp) work by training individual models for each available parallel dataset of text-meaning pairs. in this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. in particular, we focus on translating text to code signature representations using the software component datasets of richardson and kuhn (2017a,b). the advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. to facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and we apply this method to two other benchmark sp tasks.",4 "object recognition with imperfect perception and redundant description. this paper deals with a scene recognition system in a robotics context. the general problem is to match images with a priori descriptions. a typical mission would consist of identifying an object in an installation, where the vision system is situated at the end of a manipulator and a human operator provides the description, formulated in a pseudo-natural language and possibly redundant. the originality of this work comes from the nature of the description, where special attention is given to the management of imprecision and uncertainty in the interpretation process and to the way of assessing the description redundancy to reinforce the overall matching likelihood.",4 "device? - detecting a smart device's wearing location in the context of active safety for vulnerable road users.
this article describes an approach to detect the wearing location of smart devices worn by pedestrians and cyclists. this detection, based solely on the sensors of the smart devices, is important context-information that can be used to parametrize subsequent algorithms, e.g. dead reckoning or intention detection, to improve the safety of vulnerable road users. wearing location recognition in terms of organic computing (oc) can be seen as a step towards self-awareness and self-adaptation. the wearing location detection is presented as a two-stage process, subdivided into moving detection followed by wearing location classification. finally, the approach is evaluated on a real world dataset consisting of pedestrians and cyclists.",4 "stability of phase retrievable frames. in this paper we study the property of phase retrievability by redundant systems of vectors under perturbations of the frame set. specifically we show that if a set $\fc$ of $m$ vectors in a complex hilbert space of dimension n allows vector reconstruction from the magnitudes of its coefficients, then there is a perturbation bound $\rho$ so that any frame set within $\rho$ of $\fc$ has the same property. in particular this proves that the recent construction in \cite{bh13} is stable under perturbations. by the same token we reduce the critical cardinality conjectured in \cite{bcmn13a} by proving a stability result for non phase-retrievable frames.",12 "global preferential consistency for the topological sorting-based maximal spanning tree problem. we introduce a new type of fully computable problems for dss dedicated to maximal spanning tree problems, based on deduction and choice: preferential consistency problems. to show its interest, we describe a new compact representation of preferences specific to spanning trees, identifying an efficient maximal spanning tree sub-problem. next, we compare this problem to a pareto-based multiobjective one. last, we propose an efficient algorithm solving the associated preferential consistency problem.",4 "non-sparse linear representations for visual tracking with online reservoir metric learning. most sparse linear representation-based trackers need to solve a computationally expensive l1-regularized optimization problem.
to address this problem, we propose a visual tracker based on non-sparse linear representations, which admit an efficient closed-form solution without sacrificing accuracy. moreover, in order to capture the correlation information between different feature dimensions, we learn a mahalanobis distance metric in an online fashion and incorporate the learned metric into the optimization problem for obtaining the linear representation. we show that online metric learning using proximity comparison significantly improves the robustness of the tracking, especially on those sequences exhibiting drastic appearance changes. furthermore, in order to prevent the unbounded growth in the number of training samples for the metric learning, we design a time-weighted reservoir sampling method to maintain and update limited-sized foreground and background sample buffers for balancing sample diversity and adaptability. experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.",4 "an investigation report on auction mechanism design. auctions are markets with strict regulations governing the information available to traders in the market and the possible actions they can take. since well designed auctions achieve desirable economic outcomes, they have been widely used in solving real-world optimization problems and in structuring stock and futures exchanges. auctions also provide a valuable testing-ground for economic theory, and they play an important role in computer-based control systems. auction mechanism design aims to manipulate the rules of an auction in order to achieve specific goals. economists traditionally use mathematical methods, mainly game theory, to analyze auctions and design new auction forms. however, due to the high complexity of auctions, mathematical models are typically simplified to obtain tractable results, which makes it difficult to apply results derived from such models to market environments in the real world. as a result, researchers are turning to empirical approaches. this report aims to survey the theoretical and empirical approaches to designing auction mechanisms and trading strategies, with more weight on the empirical ones, and to build a foundation for further research in this field.",4 "flower pollination algorithm: a novel approach for multiobjective optimization.
multiobjective design optimization problems require multiobjective optimization techniques to solve, and it is often very challenging to obtain high-quality pareto fronts accurately. in this paper, the recently developed flower pollination algorithm (fpa) is extended to solve multiobjective optimization problems. the proposed method is used to solve a set of multiobjective test functions and two bi-objective design benchmarks, and a comparison of the proposed algorithm with other algorithms has been made, which shows that the fpa is efficient with a good convergence rate. finally, the importance of parametric studies and theoretical analysis is highlighted and discussed.",12 "polynomial neural networks learnt to classify eeg signals. a neural network based technique is presented, which is able to successfully extract polynomial classification rules from labeled electroencephalogram (eeg) signals. to represent the classification rules in an analytical form, we use polynomial neural networks trained by a modified group method of data handling (gmdh). the classification rules were extracted from clinical eeg data recorded from an alzheimer patient and from sudden death risk patients. a third data set consists of eeg recordings which include both normal and artifact segments. these eeg data were visually identified by medical experts. the extracted polynomial rules were verified on testing eeg data and allow us to correctly classify 72% of the risk group patients and 96.5% of the segments. these rules perform slightly better than standard feedforward neural networks.",4 "reducing the computational cost of multi-objective evolutionary algorithms by filtering worthless individuals. the large number of exact fitness function evaluations makes evolutionary algorithms computationally costly. in real-world problems, reducing the number of evaluations is valuable even at the price of increasing computational complexity and spending more time. to fulfill this target, we introduce an effective factor which, in contrast to the factor applied in adaptive fuzzy fitness granulation with the non-dominated sorting genetic algorithm-ii, filters worthless individuals more precisely. the proposed approach is compared against adaptive fuzzy fitness granulation with the non-dominated sorting genetic algorithm-ii, using the hyper volume and inverted generational distance performance measures.
the proposed method is applied to 1 traditional and 1 state-of-the-art benchmark, considering 3 different dimensions. from an average performance view, the results indicate that although decreasing the number of fitness evaluations leads to some performance reduction, it is not tangible compared to the gain.",4 "a fractal dimension based optimal wavelet packet analysis technique for the classification of meningioma brain tumours. with the heterogeneous nature of tissue texture, using a single resolution approach for optimum classification might not suffice. in contrast, a multiresolution wavelet packet analysis can decompose the input signal into a set of frequency subbands, giving the opportunity to characterise the texture at the appropriate frequency channel. an adaptive best bases algorithm for optimal bases selection for meningioma histopathological images is proposed, via applying the fractal dimension (fd) as the bases selection criterion in a tree-structured manner. thereby, the significant subband that better identifies texture discontinuities is chosen for further decomposition, and its fractal signature would represent the extracted feature vector for classification. the best basis selection using the fd outperformed the energy based selection approaches, achieving an overall classification accuracy of 91.25% compared to 83.44% and 73.75% for the co-occurrence matrix and energy texture signatures, respectively.",4 "deep reinforcement learning for unsupervised video summarization with a diversity-representativeness reward. video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of the original videos. in this paper, we formulate video summarization as a sequential decision-making process and develop a deep summarization network (dsn) to summarize videos. dsn predicts for each video frame a probability, which indicates how likely a frame is to be selected, and then takes actions based on the probability distributions to select frames, forming video summaries. to train our dsn, we propose an end-to-end, reinforcement learning-based framework, in which we design a novel reward function that jointly accounts for the diversity and representativeness of generated summaries and does not rely on labels or user interactions at all.
during training, the reward function judges how diverse and representative the generated summaries are, while the dsn strives to earn higher rewards by learning to produce more diverse and more representative summaries. since labels are not required, our method is fully unsupervised. extensive experiments on two benchmark datasets show that our unsupervised method not only outperforms other state-of-the-art unsupervised methods, but is also comparable to or even superior to most published supervised approaches.",4 "on the assumptions behind dempster's rule. this paper examines the concept of a combination rule for belief functions. it is shown that two fairly simple and apparently reasonable assumptions determine dempster's rule, giving a new justification for it.",4 "an optimal algorithm for the thresholding bandit problem. we study a specific \textit{combinatorial pure exploration stochastic bandit problem} where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and \textit{for a fixed time horizon}. we propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. to the best of our knowledge, this is the first non-trivial pure exploration setting with \textit{fixed budget} for which optimal strategies have been constructed.",19 a survey of deep learning techniques for mobile robot applications. advancements in deep learning over the years have attracted research into how deep artificial neural networks can be used in robotic systems. this research survey presents a summarization of the current research with a specific focus on the gains and obstacles for deep learning to be applied to mobile robotics.,4 "proceedings of the workshop on brain analysis using connectivity networks - bacon 2016. understanding brain connectivity in a network-theoretic context has shown much promise in recent years. this type of analysis identifies brain organisational principles, bringing a new perspective to neuroscience. at the same time, large public databases of connectomic data are now available. however, connectome analysis is still an emerging field and there is a crucial need for robust computational methods to fully unravel its potential.
this workshop provides a platform to discuss the development of new analytic techniques; methods for evaluating and validating commonly used approaches; as well as the effects of variations in pre-processing steps.",4 "forecasting sleep apnea with dynamic network models. dynamic network models (dnms) are belief networks for temporal reasoning. the dnm methodology combines techniques from time series analysis and probabilistic reasoning to provide (1) a knowledge representation that integrates noncontemporaneous and contemporaneous dependencies and (2) methods for iteratively refining these dependencies in response to the effects of exogenous influences. we use belief-network inference algorithms to perform forecasting, control, and discrete event simulation on dnms. the belief network formulation allows us to move beyond the traditional assumptions of linearity in the relationships among time-dependent variables and normality in their probability distributions. we demonstrate the dnm methodology on an important forecasting problem in medicine. we conclude with a discussion of how the methodology addresses several limitations found in traditional time series analyses.",4 "an extended comment on language trees and zipping. this is an extended version of a comment submitted to physical review letters. we first point out the inappropriateness of publishing a letter unrelated to physics. next, we give experimental results showing that the technique used in the letter is 3 times worse and 17 times slower than a simple baseline. finally, we review the literature, showing that the ideas of the letter are not novel. we conclude by suggesting that physical review letters should not publish letters unrelated to physics.",3 "automatic data deformation analysis in an evolving folksonomy driven environment. the folksodriven framework makes it possible for data scientists to define an ontology environment in which to search for buried patterns that have some kind of predictive power and to build predictive models more effectively. it accomplishes this through abstractions that isolate parameters of the predictive modeling process, searching for patterns and designing the feature set, too. to reflect the evolving knowledge, this paper considers ontologies based on folksonomies according to a new concept structure called ""folksodriven"" to represent folksonomies.
so, the studies of the transformational regulation of folksodriven tags are regarded as important for adaptive folksonomies classifications in an evolving environment used by intelligent systems to represent the knowledge sharing. folksodriven tags are used to categorize salient data points so that they can be fed to a machine-learning system for ""featurizing"" the data.",4 "meta networks. neural networks have been successfully applied in applications with a large amount of labeled data. however, the task of rapid generalization on new concepts with small training data while preserving performance on previously learned ones still presents a significant challenge to neural network models. in this work, we introduce a novel meta learning method, meta networks (metanet), that learns meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. when evaluated on the omniglot and mini-imagenet benchmarks, our metanet models achieve near human-level performance and outperform the baseline approaches by up to 6% accuracy. we demonstrate several appealing properties of metanet relating to generalization and continual learning.",4 "ltsg: latent topical skip-gram for mutually learning topic model and vector representations. topic models have been widely used for discovering latent topics which are shared across documents in text mining. vector representations, word embeddings and topic embeddings, map words and topics into a low-dimensional and dense real-value vector space, and have obtained high performance in nlp tasks. however, most existing models assume that the result trained by one of them is perfectly correct and is used as prior knowledge for improving the other model. few models use the information trained from an external large corpus to help improve a smaller corpus. in this paper, we aim to build an algorithm framework that makes topic models and vector representations mutually improve each other within the same corpus. an em-style algorithm framework is employed to iteratively optimize both the topic model and the vector representations. experimental results show that our model outperforms state-of-the-art methods on various nlp tasks.",4 "heron inference for bayesian graphical models. bayesian graphical models have been shown to be a powerful tool for discovering uncertainty and causal structure from real-world data in many application fields.
current inference methods primarily follow different kinds of trade-offs between computational complexity and predictive accuracy. at one end of the spectrum, variational inference approaches perform well in computational efficiency; at the other end, gibbs sampling approaches are known to be relatively accurate for prediction in practice. in this paper, we extend an existing gibbs sampling method and propose a new deterministic heron inference (heron) for a family of bayesian graphical models. in addition to the support of nontrivial distributability, one more benefit of heron is that it allows us to easily assess the convergence status and also largely improves the running efficiency. we evaluate heron against the standard collapsed gibbs sampler and a state-of-the-art state augmentation method for inference in well-known graphical models. experimental results using publicly available real-life data demonstrate that heron significantly outperforms the baseline methods for inferring bayesian graphical models.",4 "an all-in-one convolutional neural network for face analysis. we present a multi-purpose algorithm for simultaneous face detection, face alignment, pose estimation, gender recognition, smile detection, age estimation and face recognition using a single deep convolutional neural network (cnn). the proposed method employs a multi-task learning framework that regularizes the shared parameters of the cnn and builds a synergy among different domains and tasks. extensive experiments show that the network has a better understanding of the face and achieves state-of-the-art results for most of these tasks.",4 "frugal bribery in voting. bribery in elections is an important problem in computational social choice theory. however, bribery with money is often illegal in elections. motivated by this, we introduce the notion of frugal bribery and formulate two new pertinent computational problems which we call frugal-bribery and frugal-$bribery to capture bribery without money in elections. in the proposed model, the briber is frugal in nature, and this is captured by her inability to bribe votes of a certain kind, namely, non-vulnerable votes. in the frugal-bribery problem, the goal is to make a certain candidate win the election by changing only the vulnerable votes.
in the frugal-$bribery problem, the vulnerable votes have prices and the goal is to make a certain candidate win the election by changing the vulnerable votes, subject to a budget constraint of the briber. we formulate two natural variants of the frugal-$bribery problem, namely uniform-frugal-$bribery and nonuniform-frugal-$bribery, where the prices of the vulnerable votes are, respectively, all the same or different. we study the computational complexity of these problems for unweighted and weighted elections under several commonly used voting rules. we observe that, even with a small number of candidates, the problems are intractable for all the voting rules studied here for weighted elections, with the sole exception of the frugal-bribery problem for the plurality voting rule. in contrast, we have polynomial time algorithms for the frugal-bribery problem for the plurality, veto, k-approval, k-veto, and plurality with runoff voting rules for unweighted elections. however, the frugal-$bribery problem is intractable for all the voting rules studied here barring the plurality and the veto voting rules for unweighted elections.",4 "generalised seizure prediction with convolutional neural networks for intracranial and scalp electroencephalogram data analysis. seizure prediction has attracted growing attention as one of the most challenging predictive data analysis efforts in order to improve the life of patients living with drug-resistant epilepsy and tonic seizures. many outstanding works have reported great results in providing sensible indirect (warning systems) or direct (interactive neural-stimulation) control of refractory seizures, some achieving high performance. however, many works put heavy reliance on handcrafted feature extraction and/or carefully tailored feature engineering for each patient to achieve high sensitivity and a low false prediction rate on a particular dataset. this limits the benefit of these approaches if a different dataset is used. in this paper we apply convolutional neural networks (cnns) to different intracranial and scalp electroencephalogram (eeg) datasets and propose a generalized retrospective and patient-specific seizure prediction method. we use a short-time fourier transform (stft) on 30-second eeg windows with 50% overlapping to extract information in both the frequency and time domains.
a standardization step is then applied on the stft components across the whole frequency range to prevent high-frequency features from being influenced by those at lower frequencies. a convolutional neural network model is used for both feature extraction and classification to separate preictal segments from interictal ones. the proposed approach achieves sensitivity of 81.4%, 81.2%, and 82.3% and false prediction rate (fpr) of 0.06/h, 0.16/h, and 0.22/h on the freiburg hospital intracranial eeg (ieeg) dataset, the children's hospital of boston-mit scalp eeg (seeg) dataset, and the kaggle american epilepsy society seizure prediction challenge's dataset, respectively. our prediction method is also statistically better than an unspecific random predictor for most patients in all three datasets.",4 "mindx: denoising mixed impulse poisson-gaussian noise using proximal algorithms. we present a novel algorithm for blind denoising of images corrupted by mixed impulse, poisson, and gaussian noises. the algorithm starts by applying the anscombe variance-stabilizing transformation to convert the poisson noise into white gaussian noise. it then applies a combinatorial optimization technique to denoise the mixed impulse and gaussian noise using proximal algorithms. the result is then processed by the inverse anscombe transform. we compare our algorithm to state of the art methods on standard images, and show its superior performance in various noise conditions.",4 "data-dependent kernels in nearly-linear time. we propose a method to efficiently construct data-dependent kernels which can make use of large quantities of (unlabeled) data. our construction makes an approximation in the standard construction of semi-supervised kernels of sindhwani et al. 2005. in typical cases these kernels can be computed in nearly-linear time (in the amount of data), improving on the cubic time of the standard construction, and enabling large scale semi-supervised learning in a variety of contexts. the methods are validated on semi-supervised and unsupervised problems on data sets containing up to 64,000 sample points.",4 "sparse low-rank approximations of large symmetric matrices using biharmonic interpolation. symmetric matrices are widely used in machine learning problems such as kernel machines and manifold learning.
using large datasets often requires computing low-rank approximations of these symmetric matrices so that they fit in memory. in this paper, we present a novel method based on biharmonic interpolation for low-rank matrix approximation. the method exploits knowledge of the data manifold to learn an interpolation operator that approximates values using a subset of randomly selected landmark points. this operator is readily sparsified, reducing memory requirements by at least two orders of magnitude without significant loss of accuracy. we show that our method can approximate large datasets using twenty times fewer landmarks than other methods. further, numerical results suggest that our method is stable even when numerical difficulties arise for other methods.",19 "empowering olac extension using anusaaraka and effective text processing using double byte coding. this paper reviews the hurdles faced while trying to implement the olac extension for dravidian / indian languages. the paper explores the possibilities which could minimise or solve these problems. in this context, the chinese system of text processing and the anusaaraka system are scrutinised.",4 "path-based vs. distributional information in recognizing lexical semantic relations. recognizing various semantic relations between terms is beneficial for many nlp tasks. while path-based and distributional information sources are considered complementary for this task, the superior results the latter showed recently suggested that the former's contribution might have become obsolete. we follow the recent success of an integrated neural method for hypernymy detection (shwartz et al., 2016) and extend it to recognize multiple relations. the empirical results show that this method is effective in the multiclass setting as well. we further show that the path-based information source always contributes to the classification, and analyze the cases in which it mostly complements the distributional information.",4 "fusing continuous-valued medical labels using a bayesian model. with the rapid increase in the volume of time series medical data available from wearable devices, there is a need to employ automated algorithms to label data. examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). however, automated algorithms tend to be unreliable, resulting in lower quality care.
expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. to address these problems, a bayesian continuous-valued label aggregator (bcla) is proposed to provide a reliable estimation of label aggregation while accurately inferring the precision and bias of each algorithm. the bcla was applied to qt interval (a pro-arrhythmic indicator) estimation from the electrocardiogram, using labels from the 2006 physionet/computing in cardiology challenge database. it was compared to the mean, median, and a previously proposed expectation maximization (em) label aggregation approach. while accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the bcla was 11.78$\pm$0.63ms, significantly outperforming the best challenge entry (15.37$\pm$2.13ms) as well as the em, mean, and median voting strategies (14.76$\pm$0.52ms, 17.61$\pm$0.55ms, and 14.43$\pm$0.57ms respectively, with $p<0.0001$).",4 "prior matters: simple and general methods for evaluating and improving topic quality in topic modeling. latent dirichlet allocation (lda) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content. even when canonical stopwords are manually removed, uninformative words common in the corpus can still dominate the most probable words in a topic. in this work, we first show how standard topic quality measures such as coherence and pointwise mutual information act counter-intuitively in the presence of common but irrelevant words, making it difficult to even quantitatively identify situations in which topics may be dominated by stopwords. we propose an additional topic quality metric that targets the stopword problem, and show that it, unlike the standard measures, correctly correlates with human judgements of quality. we also propose a simple-to-implement strategy for generating topics that are evaluated to be of much higher quality by human assessment and by the new metric. in this approach, a collection of informative priors easily introduced into most lda-style inference methods automatically promotes terms with domain relevance and demotes domain-specific stop words.
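a drastically simplified stand-in for the label-fusion idea above is a precision-weighted average (hypothetical sketch: unlike the bcla, it assumes each algorithm's error variance is already known rather than inferred, and it ignores bias):

```python
import numpy as np

def precision_weighted_fuse(labels, variances):
    """Fuse continuous labels from several algorithms by weighting each
    with the inverse of its (assumed known) error variance."""
    w = 1.0 / np.asarray(variances)
    return float((w * np.asarray(labels)).sum() / w.sum())

# three algorithms estimate one QT interval (ms); the middle one is noisiest
fused = precision_weighted_fuse([400.0, 430.0, 402.0], [1.0, 100.0, 1.0])
```

the fused estimate is pulled toward the two precise labellers and largely ignores the noisy one.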
we demonstrate the approach's effectiveness on three different domains: department of labor accident reports, online health forum posts, and nips abstracts. overall we find that current practices thought to solve this problem do not do so adequately, and that our proposal offers a substantial improvement for those interested in interpreting their topics as objects in their own right.",4 "stacking-based deep neural network: deep analytic network on convolutional spectral histogram features. a stacking-based deep neural network (s-dnn), in general, denotes a deep neural network (dnn) resemblance in terms of its deep, feedforward network architecture. a typical s-dnn aggregates a variable number of individually learnable modules in series to assemble a dnn-alike alternative for the targeted object recognition tasks. this work likewise devises an s-dnn instantiation, dubbed deep analytic network (dan), on top of spectral histogram (sh) features. the dan learning principle relies on ridge regression and some key dnn constituents, specifically, the rectified linear unit, fine-tuning, and normalization. the dan aptitude is scrutinized on three repositories of varying domains, including feret (faces), mnist (handwritten digits), and cifar10 (natural objects). the empirical results unveil that dan escalates the sh baseline performance over a sufficiently deep layer.",4 "long-term evolution of genetic programming populations. we evolve binary mux-6 trees for up to 100000 generations, evolving some programs with more than a hundred million nodes. our unbounded long-term evolution experiment ltee gp appears not to evolve building blocks but does suggest a limit to bloat. we see periods of tens and even hundreds of generations where the population is 100 percent functionally converged. the distribution of tree sizes is predicted by theory.",4 "distributed adaptive lmf algorithm for sparse parameter estimation in gaussian mixture noise. a distributed adaptive algorithm for estimation of sparse unknown parameters in the presence of nongaussian noise is proposed in this paper based on the normalized least mean fourth (nlmf) criterion. in the first step, the local adaptive nlmf algorithm is modified by the zero norm in order to speed up the convergence rate and also to reduce the steady state error power in sparse conditions.
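the learning principle that the dan abstract above relies on, ridge regression, has a closed form; a minimal sketch (shapes and the regularization weight are illustrative, not taken from the paper):

```python
import numpy as np

def ridge_fit(X, Y, lam=1e-2):
    """Closed-form ridge regression: solve (X'X + lam*I) W = X'Y.
    X is (n, d) features, Y is (n, k) targets."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
W_true = rng.normal(size=(5, 3))
Y = X @ W_true                      # noiseless targets for the toy check
W = ridge_fit(X, Y, lam=1e-6)
```

because each module is fit analytically like this, an s-dnn-style stack avoids end-to-end backpropagation.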
then, the proposed algorithm is extended to the distributed scenario, in which an improvement in estimation performance is achieved due to cooperation among the local adaptive filters. simulation results show the superiority of the proposed algorithm in comparison with conventional nlmf algorithms.",4 "explorative data analysis for changes in neural activity. neural recordings are nonstationary time series, i.e. their properties typically change over time. identifying specific changes, e.g. those induced by a learning task, can shed light on the underlying neural processes. however, such changes of interest are often masked by strong unrelated changes, which can be of physiological origin or due to measurement artifacts. we propose a novel algorithm for disentangling such different causes of non-stationarity in a manner that enables better neurophysiological interpretation for a wider set of experimental paradigms. a key ingredient is the repeated application of stationary subspace analysis (ssa) using different temporal scales. the usefulness of our explorative approach is demonstrated in simulations, theory, and eeg experiments with 80 brain-computer-interfacing (bci) subjects.",16 "stepwise regression for unsupervised learning. we consider unsupervised extensions of the fast stepwise linear regression algorithm \cite{efroymson1960multiple}. these extensions allow one to efficiently identify highly-representative feature variable subsets within a given set of jointly distributed variables. this in turn allows for efficient dimensional reduction of large data sets via the removal of redundant features. fast search is effected through the avoidance of repeat computations across trial fits, allowing a full representative-importance ranking of a set of feature variables to be carried out in $o(n^2 m)$ time, where $n$ is the number of variables and $m$ is the number of data samples available. this runtime complexity matches that needed to carry out a single regression and is $o(n^2)$ faster than naive implementations. we present pseudocode suitable for efficient forward, reverse, and forward-reverse unsupervised feature selection. to illustrate the algorithm's application, we apply it to the problem of identifying representative stocks within a given financial market index -- a challenge relevant to the design of exchange traded funds (etfs).
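the least-mean-fourth update underlying the nlmf algorithm above minimizes the fourth power of the error; a single-filter sketch (the step size and signals are illustrative, and the zero-norm sparsity term and diffusion cooperation are omitted):

```python
import numpy as np

def nlmf_step(w, x, d, mu=0.05, eps=1e-8):
    """One normalized least-mean-fourth (NLMF) update. The cost J = e**4
    gives a gradient proportional to e**3 * x; normalization divides by
    the input power."""
    e = d - w @ x
    return w + mu * (e ** 3) * x / (eps + x @ x)

rng = np.random.default_rng(0)
w_true = np.array([0.5, -0.3, 0.8])   # unknown parameter to identify
w = np.zeros(3)
for _ in range(5000):
    x = rng.normal(size=3)
    d = w_true @ x                    # noiseless desired signal
    w = nlmf_step(w, x, d)
```

compared with lms, the cubic error term adapts aggressively for large errors and gently near convergence, which is why lmf-family filters suit sub-gaussian noise.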
we also characterize the growth of numerical error with iteration step in our algorithms, and finally demonstrate and rationalize the observation that the forward and reverse algorithms return exactly inverted feature orderings in the weakly-correlated feature set regime.",4 "missforest - nonparametric missing value imputation for mixed-type data. modern data acquisition based on high-throughput technology often faces the problem of missing data. algorithms commonly used in the analysis of such large-scale data often depend on a complete set. missing value imputation offers a solution to this problem. however, the majority of available imputation methods are restricted to one type of variable only: continuous or categorical. in mixed-type data the different types are usually handled separately. such methods therefore ignore possible relations between variable types. we propose a nonparametric method which can cope with different types of variables simultaneously. we compare several state of the art methods for the imputation of missing values. we propose and evaluate an iterative imputation method (missforest) based on random forest. by averaging over many unpruned classification or regression trees, random forest intrinsically constitutes a multiple imputation scheme. using the built-in out-of-bag error estimates of random forest, we are able to estimate the imputation error without the need of a test set. evaluation is performed on multiple data sets coming from a diverse selection of biological fields with artificially introduced missing values ranging from 10% to 30%. we show that missforest can successfully handle missing values, particularly in data sets including different types of variables. in our comparative study missforest outperforms other methods of imputation, especially in data settings where complex interactions and nonlinear relations are suspected. the out-of-bag imputation error estimates of missforest prove adequate in all settings. additionally, missforest exhibits attractive computational efficiency and can cope with high-dimensional data.",19 "ensemble classifier approach for breast cancer detection and malignancy grading - a review. the diagnosed cases of breast cancer are increasing annually and are unfortunately being converted into a high mortality rate.
cancer, at its early stages, is hard to detect because the malicious cells show properties (density) similar to those shown by non-malicious cells. the mortality ratio could be minimized if breast cancer could be detected in its early stages. current systems have not been able to achieve a fully automatic system which is not only capable of detecting breast cancer but can also detect its stage. estimation of malignancy grading is important in diagnosing the degree of growth of malicious cells as well as in selecting a proper therapy for the patient. therefore, a complete and efficient clinical decision support system is proposed which is capable of achieving a breast cancer malignancy grading scheme efficiently. the system is based on the image processing and machine learning domains. the class imbalance problem, a machine learning problem, occurs when the instances of one class greatly outnumber the instances of the other class, resulting in inefficient classification of samples and hence a bad decision support system. therefore eusboost, an ensemble based classifier, is proposed which is efficient and able to outperform other classifiers, as it takes the benefits of both the boosting algorithm and random undersampling techniques. a comparison of eusboost with other techniques is also shown in this paper.",4 "supervised feature selection for diagnosis of coronary artery disease based on genetic algorithm. feature selection (fs) has become the focus of much research in decision support systems areas for which data sets with a tremendous number of variables are analyzed. in this paper we present a new method for the diagnosis of coronary artery diseases (cad) founded on a genetic algorithm (ga) wrapped with naive bayes (bn) based fs. basically, the cad dataset contains two classes defined with 13 features. in the ga bn algorithm, the ga generates in each iteration a subset of attributes which is evaluated using the bn in the second step of the selection procedure. the final set of attributes contains the most relevant feature model that increases the accuracy. the algorithm in this case produces 85.50% classification accuracy in the diagnosis of cad. the asset of the algorithm is then compared with the use of support vector machine (svm), multilayer perceptron (mlp) and the c4.5 decision tree algorithm. the classification accuracy for those algorithms is respectively 83.5%, 83.16% and 80.85%. consequently, the ga wrapped bn algorithm is correspondingly compared with other fs algorithms.
the obtained results have shown promising outcomes for the diagnosis of cad.",4 "dynamic pricing with demand covariates. we consider a firm that sells products over $t$ periods without knowing the demand function. the firm sequentially sets prices to earn revenue and to learn the underlying demand function simultaneously. a natural heuristic for this problem, commonly used in practice, is greedy iterative least squares (gils). at each time period, gils estimates the demand as a linear function of the price by applying least squares to the set of prior prices and realized demands. then a price that maximizes the revenue, given the estimated demand function, is used for the next time period. the performance is measured by the regret, which is the expected revenue loss relative to the optimal (oracle) pricing policy when the demand function is known. recently, den boer and zwart (2014) and keskin and zeevi (2014) demonstrated that gils is sub-optimal. they introduced algorithms which integrate forced price dispersion with gils and achieve asymptotically optimal performance. in this paper, we consider this dynamic pricing problem in a data-rich environment. in particular, we assume that the firm knows the expected demand under a particular price from historical data, and that in each period, before setting the price, the firm has access to extra information (demand covariates) which may be predictive of the demand. we prove that in this setting gils achieves asymptotically optimal regret of order $\log(t)$. we also show the following surprising result: in the original dynamic pricing problem of den boer and zwart (2014) and keskin and zeevi (2014), inclusion of a set of covariates in gils as potential demand covariates (even though they could carry no information) would make gils asymptotically optimal. we validate our results via extensive numerical simulations on synthetic and real data sets.",19 "interactive restless multi-armed bandit game and swarm intelligence effect. we obtain the conditions for the emergence of the swarm intelligence effect in an interactive game of restless multi-armed bandit (rmab). a player competes with multiple agents. each bandit has a payoff that changes with probability $p_{c}$ per round. the agents and the player choose one of three options: (1) exploit (a good bandit), (2) innovate (asocial learning for a good bandit among $n_{i}$ randomly chosen bandits), or (3) observe (social learning for a good bandit).
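the gils heuristic described in the dynamic-pricing abstract above is short enough to sketch directly (a minimal version on toy noiseless data; the linear demand model is as stated, the numbers are illustrative):

```python
import numpy as np

def gils_next_price(prices, demands):
    """Greedy iterative least squares (GILS): fit demand = a + b * price
    by least squares on past observations, then charge the revenue-
    maximizing price argmax_p p*(a + b*p) = -a / (2*b), valid for b < 0."""
    A = np.column_stack([np.ones_like(prices), prices])
    (a, b), *_ = np.linalg.lstsq(A, demands, rcond=None)
    return float(-a / (2.0 * b))

# toy run: true demand 10 - 2*p, observed without noise
prices = np.array([1.0, 2.0, 4.0])
demands = 10.0 - 2.0 * prices
p_next = gils_next_price(prices, demands)
```

on noiseless data the fit is exact and the next price is the true revenue maximizer, 2.5; the papers cited above show why, with noise and no forced dispersion, repeating this greedy step can fail to learn the demand curve.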
each agent has two parameters $(c,p_{obs})$ that specify its decision: (i) $c$, the threshold value for exploit, and (ii) $p_{obs}$, the probability of observe in learning. the parameters $(c,p_{obs})$ are uniformly distributed. we determine the optimal strategies for the player using complete knowledge of the rmab. we show whether social or asocial learning is optimal in the $(p_{c},n_{i})$ space and define the swarm intelligence effect. we conduct a laboratory experiment (67 subjects) and observe the swarm intelligence effect when $(p_{c},n_{i})$ are chosen so that social learning is far more optimal than asocial learning.",4 "learning-based approach for automatic image and video colorization. in this paper, we present a color transfer algorithm to colorize a broad range of gray images without any user intervention. the algorithm uses a machine learning-based approach to automatically colorize grayscale images. it uses the superpixel representation of the reference color images to learn the relationship between different image features and their corresponding color values. we use this learned information to predict the color value of each grayscale image superpixel. compared to processing individual image pixels, our use of superpixels helps us achieve a much higher degree of spatial consistency as well as speeding up the colorization process. the predicted color values of the gray-scale image superpixels are used to provide a 'micro-scribble' at the centroid of each superpixel. these color scribbles are refined using a voting based approach. to generate the final colorization result, we use an optimization-based approach to smoothly spread the color scribble across all pixels within a superpixel. experimental results on a broad range of images, and a comparison with existing state-of-the-art colorization methods, demonstrate the greater effectiveness of the proposed algorithm.",4 "instance-level salient object segmentation. image saliency detection has recently witnessed rapid progress due to deep convolutional neural networks. however, none of the existing methods is able to identify object instances in the detected salient regions. in this paper, we present a salient instance segmentation method that produces a saliency mask with distinct object instance labels for an input image.
our method consists of three steps: estimating a saliency map, detecting salient object contours, and identifying salient object instances. for the first two steps, we propose a multiscale saliency refinement network, which generates high-quality salient region masks and salient object contours. once integrated with multiscale combinatorial grouping and a map-based subset optimization framework, our method can generate promising salient object instance segmentation results. to promote further research and evaluation of salient instance segmentation, we also construct a new database of 1000 images with pixelwise salient instance annotations. experimental results demonstrate that our proposed method is capable of achieving state-of-the-art performance on all public benchmarks for salient region detection as well as on our new dataset for salient instance segmentation.",4 "associative memories based on multiple-valued sparse clustered networks. associative memories are structures that store data patterns and retrieve them given partial inputs. sparse clustered networks (scns) are recently-introduced binary-weighted associative memories that significantly improve the storage and retrieval capabilities over the prior state-of-the-art. however, deleting or updating the data patterns results in a significant increase in the data retrieval error probability. in this paper, we propose an algorithm to address this problem by incorporating multiple-valued weights for the interconnections used in the network. the proposed algorithm lowers the error rate by an order of magnitude for our sample network with 60% deleted contents. we then investigate the advantages of the proposed algorithm for hardware implementations.",4 "robustness of semantic segmentation models to adversarial attacks. deep neural networks (dnns) have been demonstrated to perform exceptionally well on most recognition tasks such as image classification and segmentation. however, they have also been shown to be vulnerable to adversarial examples. this phenomenon has recently attracted a lot of attention but has not been extensively studied on multiple, large-scale datasets and complex tasks such as semantic segmentation, which often require specialised networks with additional components such as crfs, dilated convolutions, skip-connections and multiscale processing.
in this paper, we present what to our knowledge is the first rigorous evaluation of adversarial attacks on modern semantic segmentation models, using two large-scale datasets. we analyse the effect of different network architectures, model capacity and multiscale processing, and show that many observations made on the task of classification do not always transfer to this more complex task. furthermore, we show how mean-field inference in deep structured models and multiscale processing naturally implement recently proposed adversarial defenses. our observations will aid future efforts in understanding and defending against adversarial examples. moreover, in the shorter term, we show which segmentation models should currently be preferred in safety-critical applications due to their inherent robustness.",4 "pomdp-lite for robust robot planning under uncertainty. the partially observable markov decision process (pomdp) provides a principled general model for planning under uncertainty. however, solving a general pomdp is computationally intractable in the worst case. this paper introduces pomdp-lite, a subclass of pomdps in which the hidden state variables are constant or only change deterministically. we show that pomdp-lite is equivalent to a set of fully observable markov decision processes indexed by a hidden parameter, and is useful for modeling a variety of interesting robotic tasks. we develop a simple model-based bayesian reinforcement learning algorithm to solve pomdp-lite models. the algorithm performs well on large-scale pomdp-lite models with $10^{20}$ states and outperforms state-of-the-art general-purpose pomdp algorithms. we further show that the algorithm is near-bayesian-optimal under suitable conditions.",4 "oracle mcg: a first peek into the coco detection challenges. the recently presented coco detection challenge will most probably be the reference benchmark in object detection for the next years. coco is two orders of magnitude larger than pascal and has four times the number of categories; in all likelihood researchers will therefore be faced with a number of new challenges. at this point, without any finished round of the competition, it is difficult for researchers to put their techniques in context, in other words, to know how good their results are. in order to give a little context, this note evaluates a hypothetical object detector consisting of an oracle picking the best object proposal from a state-of-the-art technique.
this oracle achieves ap=0.292 for segmented objects and ap=0.317 for bounding boxes, showing that the database is indeed challenging, given that this value is the best one can expect when working with object proposals without refinement.",4 "short term load forecasting models in the czech republic using soft computing paradigms. this paper presents a comparative study of six soft computing models, namely multilayer perceptron networks, elman recurrent neural network, radial basis function network, hopfield model, fuzzy inference system, and hybrid fuzzy neural network, for hourly electricity demand forecasting in the czech republic. the soft computing models were trained and tested using actual hourly load data covering seven years. a comparison of the proposed techniques is presented for predicting 2 day ahead demands for electricity. simulation results indicate that the hybrid fuzzy neural network and radial basis function networks are the best candidates for the analysis and forecasting of electricity demand.",4 "differentiable submodular maximization. we consider learning of submodular functions from data. these functions are important in machine learning and have a wide range of applications, e.g. data summarization, feature selection and active learning. despite their combinatorial nature, submodular functions can be maximized approximately with strong theoretical guarantees in polynomial time. typically, learning the submodular function and optimizing it are treated separately, i.e. the function is first learned using a proxy objective and subsequently maximized. in contrast, we show how to perform learning and optimization jointly. by interpreting the output of greedy maximization algorithms as distributions over sequences of items and smoothening these distributions, we obtain a differentiable objective. in this way, we can differentiate through the maximization algorithms and optimize the model to work well with the optimization algorithm. we theoretically characterize the error made by our approach, yielding insights into the trade-off between smoothness and accuracy. we demonstrate the effectiveness of our approach for jointly learning and optimizing on synthetic maxcut data, and on a real world product recommendation application.",19 "shape estimation from defocus cue for microscopy images via belief propagation.
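the greedy maximization that the differentiable-submodular abstract above smoothens is the classic marginal-gain loop; a minimal sketch with a toy coverage function (the item-to-element map is illustrative):

```python
def greedy_submodular_max(f, ground_set, k):
    """Plain greedy maximization of a monotone submodular set function f:
    repeatedly add the item with the largest marginal gain. This achieves
    the well-known (1 - 1/e) approximation guarantee."""
    selected = []
    for _ in range(k):
        best = max((x for x in ground_set if x not in selected),
                   key=lambda x: f(selected + [x]) - f(selected))
        selected.append(best)
    return selected

# toy coverage function: each item covers a set of elements
coverage = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 2, 3}}
f = lambda S: len(set().union(*(coverage[x] for x in S))) if S else 0
picked = greedy_submodular_max(f, list(coverage), 2)
```

here the greedy loop first picks item 3 (gain 3) and then item 2 (the only item adding a new element); replacing the hard argmax inside the loop with a softened distribution over items is what makes the objective differentiable.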
in recent years, the usefulness of 3d shape estimation has been realized in microscopic or close-range imaging, as the 3d information can be used in various applications. due to the limited depth of field at small distances, the defocus blur induced in images can provide information about the 3d shape of the object. the task of `shape from defocus' (sfd) involves the problem of estimating good quality 3d shape estimates from images with depth-dependent defocus blur. while the research area of sfd is quite well-established, approaches have largely demonstrated results on objects with bulk/coarse shape variation. however, in many cases, the objects studied under microscopes often involve fine/detailed structures, which are not explicitly considered in most methods. in addition, given that, in recent years, large data volumes are typically associated with microscopy related applications, it is also important for such sfd methods to be efficient. in this work, we provide an indication of the usefulness of the belief propagation (bp) approach in addressing these concerns for sfd. bp is known to be an efficient combinatorial optimization approach, and has been empirically demonstrated to yield good quality solutions in low-level vision problems such as image restoration, stereo disparity estimation, etc. for exploiting the efficiency of bp in sfd, we assume local space-invariance of the defocus blur, which enables the application of bp in a straightforward manner. even with this assumption, the ability of bp to provide good quality solutions using non-convex priors is reflected in yielding plausible shape estimates in the presence of fine structures on objects under microscopy imaging.",4 "polyploidy and discontinuous heredity effect on evolutionary multi-objective optimization. this paper examines the effect of mimicking the discontinuous heredity caused by carrying more than one chromosome in some living organisms' cells on evolutionary multi-objective optimization algorithms. in such a representation, the phenotype may not fully reflect the genotype. by mimicking living organisms' inheritance mechanism, some traits may be silently carried for many generations to reappear later. representations with different numbers of chromosomes per solution vector are tested on different benchmark problems with a high number of decision variables and objectives.
a comparison with the non-dominated sorting genetic algorithm-ii is done on all problems.",4 "preliminary report on the structure of croatian linguistic co-occurrence networks. in this article, we investigate the structure of croatian linguistic co-occurrence networks. we examine the change of network structure properties by systematically varying the co-occurrence window sizes, the corpus sizes, and by removing stopwords. in a co-occurrence window of size $n$ we establish a link between the current word and the $n-1$ subsequent words. the results point out that an increase of the co-occurrence window size is followed by a decrease in diameter, average path shortening and, expectedly, a condensing of the average clustering coefficient. the same can be noticed for the removal of stopwords. finally, since the size of the texts is reflected in the network properties, the results suggest that the corpus influence can be reduced by increasing the co-occurrence window size.",4 "boosting in the presence of label noise. boosting is known to be sensitive to label noise. we studied two approaches to improve adaboost's robustness against labelling errors. one is to employ a label-noise robust classifier as a base learner, the other is to modify the adaboost algorithm to be more robust. empirical evaluation shows that a committee of robust classifiers, although it converges faster than non label-noise aware adaboost, is still susceptible to label noise. however, pairing it with the new robust boosting algorithm we propose here results in a more resilient algorithm under mislabelling.",4 "nonparametric nearest neighbor descent clustering based on delaunay triangulation. in our physically inspired in-tree (it) based clustering algorithm and the series of it, there is only one free parameter involved in computing the potential value of each point. in this work, based on the delaunay triangulation or its dual voronoi tessellation, we propose a nonparametric process to compute the potential values from local information. this computation, though nonparametric, is relatively rough, and consequently, many local extreme points will be generated. however, unlike gradient-based methods, it-based methods are generally insensitive to local extremes. this positively demonstrates the superiority of the it-based methods, both parametric (previous) and nonparametric (in this work).",19 "universal intelligence: a definition of machine intelligence.
a fundamental problem in artificial intelligence is that nobody really knows what intelligence is. the problem is especially acute when we need to consider artificial systems which are significantly different to humans. in this paper we approach this problem in the following way: we take a number of well known informal definitions of human intelligence that have been given by experts, and extract their essential features. these are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. we believe that this equation formally captures the concept of machine intelligence in the broadest reasonable sense. we then show how this formal definition is related to the theory of universal optimal learning agents. finally, we survey the many tests and definitions of intelligence that have been proposed for machines.",4 "optimizing human-interpretable dialog management policy using genetic algorithm. automatic optimization of spoken dialog management policies that are robust to environmental noise has long been a goal of academia and industry. approaches based on reinforcement learning have been proved effective. however, the numerical representation of a dialog policy is human-incomprehensible and difficult for dialog system designers to verify or modify, which limits its practical application. in this paper we propose a novel framework for optimizing dialog policies specified in a domain language using a genetic algorithm. the human-interpretable representation of the policy makes the method suitable for practical employment. we present learning algorithms using user simulation and real human-machine dialogs respectively. empirical experimental results are given to show the effectiveness of the proposed approach.",4 "cuckoo search: a brief literature review. cuckoo search (cs) was introduced in 2009, and it has attracted great attention due to its promising efficiency in solving many optimization problems and real-world applications. in the last few years, many papers have been published regarding cuckoo search, and the relevant literature has expanded significantly. this chapter summarizes briefly the majority of the literature on cuckoo search in peer-reviewed journals and conferences found so far. these references are systematically classified into appropriate categories, which can be used as a basis for further research.",12 "learning neural markers of schizophrenia disorder using recurrent neural networks.
smart systems that can accurately diagnose patients with mental disorders and identify effective treatments based on brain functional imaging data are of great applicability and are gaining much attention. most previous machine learning studies use hand-designed features, such as functional connectivity, which do not maintain the potentially useful information in the spatial relationship between brain regions and the temporal profile of the signal in each region. here we propose a new method based on recurrent-convolutional neural networks to automatically learn useful representations from segments of 4-d fmri recordings. our goal is to exploit both spatial and temporal information in the functional mri movie (at the whole-brain voxel level) for identifying patients with schizophrenia.",4 "detection and tracking of general movable objects in large 3d maps. this paper studies the problem of detection and tracking of general objects with long-term dynamics, observed by a mobile robot moving in a large environment. a key problem is that, due to the environment scale, the robot can only observe a subset of the objects at any given time. since some time passes between observations of objects in different places, the objects might be moved while the robot is not there. we propose a model for this movement in which the objects typically move locally, but with a small probability jump longer distances through what we call global motion. for filtering, we decompose the posterior over local and global movements into two linked processes. the posterior over global movements and measurement associations is sampled, while we track the local movement analytically using kalman filters. this novel filter is evaluated on point cloud data gathered autonomously by a mobile robot over an extended period of time. we show that tracking jumping objects is feasible, and that the proposed probabilistic treatment outperforms previous methods when applied to real world data. the key to efficient probabilistic tracking in this scenario is focused sampling of the object posteriors.",4 "sparse bilinear logistic regression. in this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis.
the underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. we provide a theoretical guarantee of global convergence and estimate the asymptotical convergence rate using the kurdyka-{\l}ojasiewicz inequality. a range of experiments with simulated and real data demonstrate that sparse bilinear logistic regression outperforms current techniques in several important applications.",12 "training a fully convolutional neural network to route integrated circuits. we present a deep, fully convolutional neural network that learns to route a circuit layout net with an appropriate choice of metal tracks and wire class combinations. inputs to the network are encoded layouts containing the spatial locations of the pins to be routed. after 15 fully convolutional stages followed by a score comparator, the network outputs 8 layout layers (corresponding to 4 route layers, 3 via layers and an identity-mapped pin layer) which are then decoded to obtain the routed layouts. we formulate this as a binary segmentation problem on a per-pixel per-layer basis, where the network is trained to correctly classify pixels in each layout layer as 'on' or 'off'. to demonstrate the learnability of layout design rules, we train the network on a dataset of 50,000 train and 10,000 validation samples that we generate based on certain pre-defined layout constraints. precision, recall and $f_1$ score metrics are used to track the training progress. our network achieves $f_1\approx97\%$ on the train set and $f_1\approx92\%$ on the validation set. we use pytorch for implementing our model. code is made publicly available at https://github.com/sjain-stanford/deep-route .",4 "analysis of a design pattern for teaching with features and labels. we study the task of teaching a machine to classify objects using features and labels. we introduce the error-driven-featuring design pattern for teaching using features and labels, in which a teacher prefers to introduce features only as needed. we analyze the potential risks and benefits of this teaching pattern through the use of teaching protocols, illustrative examples, and by providing bounds on the effort required for an optimal machine teacher using a linear learning algorithm, the most commonly used type of learner in interactive machine learning systems.
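the per-pixel $f_1$ metric used to track training in the routing abstract above is the harmonic mean of precision and recall over binary masks; a minimal sketch (the toy masks are illustrative):

```python
import numpy as np

def f1_score_binary(pred, target, eps=1e-9):
    """F1 over binary masks: harmonic mean of precision and recall,
    computed from the true-positive count."""
    tp = np.logical_and(pred, target).sum()
    precision = tp / (pred.sum() + eps)
    recall = tp / (target.sum() + eps)
    return float(2 * precision * recall / (precision + recall + eps))

pred   = np.array([1, 1, 0, 0], dtype=bool)   # predicted 'on' pixels
target = np.array([1, 0, 1, 0], dtype=bool)   # ground-truth 'on' pixels
f1 = f1_score_binary(pred, target)
```

unlike plain accuracy, this metric is insensitive to the large number of 'off' pixels that dominate sparse layout layers.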
our analysis provides a deeper understanding of the potential trade-offs between different learning algorithms and the effort required for featuring (creating new features) versus labeling (providing labels for objects).",4 "multifractal analysis of sentence lengths in english literary texts. this paper presents an analysis of 30 literary texts written in english by different authors. for each text, we created a time series representing the length of sentences in words and analyzed its fractal properties using two methods of multifractal analysis: mfdfa and wtmm. both methods showed that some of the texts considered are multifractal in this representation, while the majority of texts are not multifractal or even fractal at all. among the 30 books, only those with sufficiently correlated lengths of consecutive sentences could be analyzed as signals interpreted as real multifractals. an interesting direction for future investigations would be identifying the specific features which cause certain texts to be multifractal and others monofractal or even not fractal at all.",15 "empirical evaluation of four tensor decomposition algorithms. higher-order tensor decompositions are analogous to the familiar singular value decomposition (svd), but they transcend the limitations of matrices (second-order tensors). the svd is a powerful tool that has achieved impressive results in information retrieval, collaborative filtering, computational linguistics, computational vision, and other fields. however, the svd is limited to two-dimensional arrays of data (two modes), and many potential applications have three or more modes, which require higher-order tensor decompositions. this paper evaluates four algorithms for higher-order tensor decomposition: higher-order singular value decomposition (ho-svd), higher-order orthogonal iteration (hooi), slice projection (sp), and multislice projection (mp). we measure the time (elapsed run time), space (ram and disk space requirements), and fit (tensor reconstruction accuracy) of the four algorithms under a variety of conditions. we find that standard implementations of ho-svd and hooi do not scale up to larger tensors, due to increasing ram requirements. we recommend hooi for tensors that are small enough for the available ram and mp for larger tensors.",4 "classifying and visualizing motion capture sequences using deep neural networks.
gesture recognition using motion capture data depth sensors recently drawn attention vision recognition. currently systems classify dataset couple dozens different actions. moreover, feature extraction data often computational complex. paper, propose novel system recognize actions skeleton data simple, effective, features using deep neural networks. features extracted frame based relative positions joints (po), temporal differences (td), normalized trajectories motion (nt). given features hybrid multi-layer perceptron trained, simultaneously classifies reconstructs input data. use deep autoencoder visualize learnt features, experiments show deep neural networks capture discriminative information than, instance, principal component analysis can. test system public database 65 classes 2,000 motion sequences. obtain accuracy 95% is, knowledge, state art result large dataset.",4 "interpretable classifiers using rules bayesian analysis: building better stroke prediction model. aim produce predictive models accurate, also interpretable human experts. models decision lists, consist series if...then... statements (e.g., high blood pressure, stroke) discretize high-dimensional, multivariate feature space series simple, readily interpretable decision statements. introduce generative model called bayesian rule lists yields posterior distribution possible decision lists. employs novel prior structure encourage sparsity. experiments show bayesian rule lists predictive accuracy par current top algorithms prediction machine learning. method motivated recent developments personalized medicine, used produce highly accurate interpretable medical scoring systems. demonstrate producing alternative chads$_2$ score, actively used clinical practice estimating risk stroke patients atrial fibrillation. model interpretable chads$_2$, accurate.",19 "reasoning uncertainty: monte carlo results. series monte carlo studies performed compare behavior alternative procedures reasoning uncertainty. 
behavior several bayesian, linear model default reasoning procedures examined context increasing levels calibration error. interesting result bayesian procedures tended output extreme posterior belief values (posterior beliefs near 0.0 1.0) techniques, linear models relatively less likely output strong support erroneous conclusion. also, accounting probabilistic dependencies evidence items important bayesian linear updating procedures.",4 "outer-product hidden markov model polyphonic midi score following. present polyphonic midi score-following algorithm capable following performances arbitrary repeats skips, based probabilistic model musical performances. attractive practical applications score following handle repeats skips may made arbitrarily performances, algorithms previously described literature cannot applied scores practical length due problems large computational complexity. propose new type hidden markov model (hmm) performance model describe arbitrary repeats skips including performer tendencies distributed score positions them, derive efficient score-following algorithm reduces computational complexity without pruning. theoretical discussion much information performer tendencies improves score-following results given. proposed score-following algorithm also admits performance mistakes demonstrated effective practical situations carrying evaluations human performances. proposed hmm potentially valuable topics information processing also provide detailed description inference algorithms.",4 "cognitive architecture direction attention founded subliminal memory searches, pseudorandom nonstop. way explaining brain works logically, human associative memory modeled logical memory neurons, corresponding standard digital circuits. resulting cognitive architecture incorporates basic psychological elements short term long term memory. novel architecture memory searches using cues chosen pseudorandomly short term memory. 
recalls alternated sensory images, many tens per second, analyzed subliminally ongoing process, determine direction attention short term memory.",4 "3d shape estimation 2d landmarks: convex relaxation approach. investigate problem estimating 3d shape object, given set 2d landmarks single image. alleviate reconstruction ambiguity, widely-used approach confine unknown 3d shape within shape space built upon existing shapes. approach proven successful various applications, challenging issue remains, i.e., joint estimation shape parameters camera-pose parameters requires solve nonconvex optimization problem. existing methods often adopt alternating minimization scheme locally update parameters, consequently solution sensitive initialization. paper, propose convex formulation address problem develop efficient algorithm solve proposed convex program. demonstrate exact recovery property proposed method, merits compared alternative methods, applicability human pose car shape estimation.",4 "detecting overlapping temporal community structure time-evolving networks. present principled approach detecting overlapping temporal community structure dynamic networks. method based following framework: find overlapping temporal community structure maximizes quality function associated snapshot network subject temporal smoothness constraint. novel quality function smoothness constraint proposed handle overlaps, new convex relaxation used solve resulting combinatorial optimization problem. provide theoretical guarantees well experimental results reveal community structure real synthetic networks. main insight certain structures identified temporal correlation considered communities allowed overlap. general, discovering overlapping temporal community structure enhance understanding real-world complex networks revealing underlying stability behind seemingly chaotic evolution.",4 "anmm: ranking short answer texts attention-based neural matching model. 
alternative question answering methods based feature engineering, deep learning approaches convolutional neural networks (cnns) long short-term memory models (lstms) recently proposed semantic matching questions answers. achieve good results, however, models combined additional features word overlap bm25 scores. without combination, models perform significantly worse methods based linguistic feature engineering. paper, propose attention based neural matching model ranking short answer text. adopt value-shared weighting scheme instead position-shared weighting scheme combining different matching signals incorporate question term importance learning using question attention network. using popular benchmark trec qa data, show relatively simple anmm model significantly outperform neural network models used question answering task, competitive models combined additional features. anmm combined additional features, outperforms baselines.",4 "learning hierarchical latent-variable model 3d shapes. propose variational shape learner (vsl), hierarchical latent-variable model 3d shape learning. vsl employs unsupervised approach learning inferring underlying structure voxelized 3d shapes. use skip-connections, model successfully learn latent, hierarchical representation objects. furthermore, realistic 3d objects easily generated sampling vsl's latent probabilistic manifold. show generative model trained end-to-end 2d images perform single image 3d model retrieval. experiments show, quantitatively qualitatively, improved performance proposed model range tasks.",4 "meta-qsar: large-scale application meta-learning drug design discovery. investigate learning quantitative structure activity relationships (qsars) case-study meta-learning. application area highest societal importance, key step development new medicines. standard qsar learning problem is: given target (usually protein) set chemical compounds (small molecules) associated bioactivities (e.g. 
inhibition target), learn predictive mapping molecular representation activity. although almost every type machine learning method applied qsar learning agreed single best way learning qsars, therefore problem area well-suited meta-learning. first carried comprehensive ever comparison machine learning methods qsar learning: 18 regression methods, 6 molecular representations, applied 2,700 qsar problems. (these results made publicly available openml represent valuable resource testing novel meta-learning methods.) investigated utility algorithm selection qsar problems. found meta-learning approach outperformed best individual qsar learning method (random forests using molecular fingerprint representation) 13%, average. conclude meta-learning outperforms base-learning methods qsar learning, investigation one extensive ever comparisons base meta-learning methods ever made, provides evidence general effectiveness meta-learning base-learning.",4 "new learning paradigm random vector functional-link network: rvfl+. school, teacher plays important role various classroom teaching patterns. likewise human learning activity, learning using privileged information (lupi) paradigm provides additional information generated teacher 'teach' learning algorithms training stage. therefore, novel learning paradigm typical teacher-student interaction mechanism. paper first present random vector functional link network based lupi paradigm, called rvfl+. rather simply combining two existing approaches, newly-derived rvfl+ fills gap neural networks lupi paradigm, offers alternative way train rvfl networks. moreover, proposed rvfl+ perform conjunction kernel trick highly complicated nonlinear feature learning, termed krvfl+. furthermore, statistical property proposed rvfl+ investigated, derive sharp high-quality generalization error bound based rademacher complexity. 
competitive experimental results 14 real-world datasets illustrate great effectiveness efficiency novel rvfl+ krvfl+, achieve better generalization performance state-of-the-art algorithms.",19 "behavior path planning coalition cognitive robots smart relocation tasks. paper outline approach solving special type navigation tasks robotic systems, coalition robots (agents) acts 2d environment, modified actions, share goal location. latter originally unreachable members coalition, common task still accomplished agents assist (e.g. modifying environment). call tasks smart relocation tasks (as solved pure path planning methods) study spatial behavior interaction robots solving them. use cognitive approach introduce semiotic knowledge representation - sign world model underlies behavioral planning methodology. planning viewed recursive search process hierarchical state-space induced signs path planning signs reside lowest level. reaching level triggers path planning accomplished state art grid-based planners focused producing smooth paths (e.g. lian) thus indirectly guaranteeing feasibility paths agent's dynamic constraints.",4 "bayesian filtering odes bounded derivatives. recently increasing interest probabilistic solvers ordinary differential equations (odes) return full probability measures, instead point estimates, solution incorporate uncertainty ode hand, e.g. vector field initial value approximately known evaluable. ode filter proposed recent work models solution ode gauss-markov process serves prior sense bayesian statistics. previous work employed wiener process prior (possibly multiple times) differentiated solution ode established equivalence corresponding solver classical numerical methods, paper raises question whether priors also yield practically useful solvers.
end, discuss range possible priors enable fast filtering propose new prior--the integrated ornstein uhlenbeck process (ioup)--that complements existing integrated wiener process (iwp) filter encoding property derivative time solution bounded sense tends drift back zero. provide experiments comparing iwp ioup filters support belief iwp approximates better divergent ode's solutions whereas ioup better prior trajectories bounded derivatives.",4 "safedrive: robust lane tracking system autonomous assisted driving limited visibility. present approach towards robust lane tracking assisted autonomous driving, particularly poor visibility. autonomous detection lane markers improves road safety, purely visual tracking desirable widespread vehicle compatibility reducing sensor intrusion, cost, energy consumption. however, visual approaches often ineffective number factors, including limited occlusion, poor weather conditions, paint wear-off. method, named safedrive, attempts improve visual lane detection approaches drastically degraded visual conditions without relying additional active sensors. scenarios visual lane detection algorithms unable detect lane markers, proposed approach uses location information vehicle locate access alternate imagery road attempts detection secondary image. subsequently, using combination feature-based pixel-based alignment, estimated location lane marker found current scene. demonstrate effectiveness system actual driving data locations united states google street view source alternate imagery.",4 "stance classification rumours sequential task exploiting tree structure social media conversations. rumour stance classification, task determines tweet collection discussing rumour supporting, denying, questioning simply commenting rumour, attracting substantial interest. introduce novel approach makes use sequence transitions observed tree-structured conversation threads twitter. 
conversation threads formed harvesting users' replies one another, results nested tree-like structure. previous work addressing stance classification task treated tweet separate unit. analyse tweets virtue position sequence test two sequential classifiers, linear-chain crf tree crf, makes different assumptions conversational structure. experiment eight twitter datasets, collected breaking news, show exploiting sequential structure twitter conversations achieves significant improvements non-sequential methods. work first model twitter conversations tree structure manner, introducing novel way tackling nlp tasks twitter conversations.",4 "learning conditional independence structure high-dimensional uncorrelated vector processes. formulate analyze graphical model selection method inferring conditional independence graph high-dimensional nonstationary gaussian random process (time series) finite-length observation. observed process samples assumed uncorrelated time time-varying marginal distribution. selection method based testing conditional variances obtained small subsets process components. allows cope high-dimensional regime, sample size (drastically) smaller process dimension. characterize required sample size proposed selection method successful high probability.",19 "optimal sparse linear auto-encoders sparse pca. principal components analysis (pca) optimal linear auto-encoder data, often used construct features. enforcing sparsity principal components promote better generalization, improving interpretability features. study problem constructing optimal sparse linear auto-encoders. two natural questions setting are: i) given level sparsity, best approximation pca achieved? ii) low-order polynomial-time algorithms asymptotically achieve optimal tradeoff sparsity approximation quality? 
work, answer questions giving efficient low-order polynomial-time algorithms constructing asymptotically \emph{optimal} linear auto-encoders (in particular, sparse features near-pca reconstruction error) demonstrate performance algorithms real data.",4 "transfer learning, soft distance-based bias, hierarchical boa. automated technique recently proposed transfer learning hierarchical bayesian optimization algorithm (hboa) based distance-based statistics. technique enables practitioners improve hboa efficiency collecting statistics probabilistic models obtained previous hboa runs using obtained statistics bias future hboa runs similar problems. purpose paper threefold: (1) test technique several classes np-complete problems, including maxsat, spin glasses minimum vertex cover; (2) demonstrate technique effective even previous runs done problems different size; (3) provide empirical evidence combining transfer learning efficiency enhancement techniques often yield nearly multiplicative speedups.",4 "weight initialization deep neural networks(dnns) using data statistics. deep neural networks (dnns) form backbone almost every state-of-the-art technique fields computer vision, speech processing, text analysis. recent advances computational technology made use dnns practical. despite overwhelming performances dnn advances computational technology, seen researchers try train models scratch. training dnns still remains difficult tedious job. main challenges researchers face training dnns vanishing/exploding gradient problem highly non-convex nature objective function million variables. approaches suggested xavier solve vanishing gradient problem providing sophisticated initialization technique. approaches quite effective achieved good results standard datasets, approaches work well practical datasets. think reason making use data statistics initializing network weights. optimizing high dimensional loss function requires careful initialization network weights. 
work, propose data dependent initialization analyze performance standard initialization techniques xavier. performed experiments practical datasets results show algorithm's superior classification accuracy.",4 "logical stochastic optimization. present logical framework represent reason stochastic optimization problems based probability answer set programming. established allowing probability optimization aggregates, e.g., minimum maximum language probability answer set programming allow minimization maximization desired criteria probabilistic environments. show application proposed logical stochastic optimization framework probability answer set programming two stages stochastic optimization problems recourse.",4 "inferring disease gene set associations rank coherence networks. computational challenge validate candidate disease genes identified high-throughput genomic study elucidate associations set candidate genes disease phenotypes. conventional gene set enrichment analysis often fails reveal associations disease phenotypes gene sets short list poorly annotated genes, existing annotations disease causative genes incomplete. propose network-based computational approach called rcnet discover associations gene sets disease phenotypes. assuming coherent associations genes ranked relevance query gene set, disease phenotypes ranked relevance hidden target disease phenotypes query gene set, formulate learning framework maximizing rank coherence respect known disease phenotype-gene associations. efficient algorithm coupling ridge regression label propagation, two variants introduced find optimal solution framework. evaluated rcnet algorithms existing baseline methods leave-one-out cross-validation task predicting recently discovered disease-gene associations omim. experiments demonstrated rcnet algorithms achieved best overall rankings compared baselines. 
validate reproducibility performance, applied algorithms identify target diseases novel candidate disease genes obtained recent studies gwas, dna copy number variation analysis, gene expression profiling. algorithms ranked target disease candidate genes top rank list many cases across three case studies. rcnet algorithms available webtool disease gene set association analysis http://compbio.cs.umn.edu/dgsa_rcnet.",16 "statistical analysis loopy belief propagation random fields. loopy belief propagation (lbp), equivalent bethe approximation statistical mechanics, message-passing-type inference method widely used analyze systems based markov random fields (mrfs). paper, propose message-passing-type method analytically evaluate quenched average lbp random fields using replica cluster variation method. proposed analytical method applicable general pair-wise mrfs random fields whose distributions differ give quenched averages bethe free energies random fields, consistent numerical results. order computational cost equivalent standard lbp. latter part paper, describe application proposed method bayesian image restoration, observed theoretical results good agreement numerical results natural images.",19 "second croatian computer vision workshop (ccvw 2013). proceedings second croatian computer vision workshop (ccvw 2013, http://www.fer.unizg.hr/crv/ccvw2013) held september 19, 2013, zagreb, croatia. workshop organized center excellence computer vision university zagreb.",4 "morphologic knowledge dynamics: revision, fusion, abduction. several tasks artificial intelligence require able find models knowledge dynamics. include belief revision, fusion belief merging, abduction. paper exploit algebraic framework mathematical morphology context propositional logic, define operations dilation erosion set formulas. derive concrete operators, based semantic approach, intuitive interpretation formally well behaved, perform revision, fusion abduction. 
computation tractability addressed, simple examples illustrate typical results obtained.",4 "prediction-adaptation-correction recurrent neural networks low-resource language speech recognition. paper, investigate use prediction-adaptation-correction recurrent neural networks (pac-rnns) low-resource speech recognition. pac-rnn comprised pair neural networks {\it correction} network uses auxiliary information given {\it prediction} network help estimate state probability. information correction network also used prediction network recurrent loop. model outperforms state-of-the-art neural networks (dnns, lstms) iarpa-babel tasks. moreover, transfer learning language similar target language help improve performance further.",4 "understanding convolutional networks apple : automatic patch pattern labeling explanation. success deep learning, recent efforts focused analyzing learned networks make classifications. interested analyzing network output based network structure information flow network layers. contribute algorithm 1) analyzing deep network find neurons 'important' terms network classification outcome, 2)automatically labeling patches input image activate important neurons. propose several measures importance neurons demonstrate technique used gain insight into, explain network decomposes image make final classification.",4 "representing reasoning probabilistic knowledge: bayesian approach. pagoda (probabilistic autonomous goal-directed agent) model autonomous learning probabilistic domains [desjardins, 1992] incorporates innovative techniques using agent's existing knowledge guide constrain learning process representing, reasoning with, learning probabilistic knowledge. paper describes probabilistic representation inference mechanism used pagoda. pagoda forms theories effects actions world state environment time. theories represented conditional probability distributions. 
restriction imposed structure theories allows inference mechanism find unique predicted distribution action world state description. restricted theories called uniquely predictive theories. inference mechanism, probability combination using independence (pci), uses minimal independence assumptions combine probabilities theory make probabilistic predictions.",4 "alternative gospel structure: order, composition, processes. survey basic mathematical structures, arguably primitive structures taught school. structures orders, without composition, (symmetric) monoidal categories. list several `real life' incarnations these. paper also serves introduction structures current potentially future uses linguistics, physics knowledge representation.",12 "model-free episodic control. state art deep reinforcement learning algorithms take many millions interactions attain human-level performance. humans, hand, quickly exploit highly rewarding nuances environment upon first discovery. brain, rapid learning thought depend hippocampus capacity episodic memory. investigate whether simple model hippocampal episodic control learn solve difficult sequential decision-making tasks. demonstrate attains highly rewarding strategy significantly faster state-of-the-art deep reinforcement learning algorithms, also achieves higher overall reward challenging domains.",19 "learning classify possible sensor failures. paper, propose general framework learn robust large-margin binary classifier corrupt measurements, called anomalies, caused sensor failure might present training set. goal minimize generalization error classifier non-corrupted measurements controlling false alarm rate associated anomalous samples. incorporating non-parametric regularizer based empirical entropy estimator, propose geometric-entropy-minimization regularized maximum entropy discrimination (gem-med) method learn classify detect anomalies joint manner. demonstrate using simulated data real multimodal data set. 
gem-med method yield improved performance previous robust classification methods terms classification accuracy anomaly detection rate.",4 "large margin image set representation classification. paper, propose novel image set representation classification method maximizing margin image sets. margin image set defined difference distance nearest image set different classes distance nearest image set class. modeling image sets using image samples affine hull models, maximizing margins image sets, image set representation parameter learning problem formulated minimization problem, optimized expectation-maximization (em) strategy accelerated proximal gradient (apg) optimization iterative algorithm. classify given test image set, assign class could provide largest margin. experiments two applications video-sequence-based face recognition demonstrate proposed method significantly outperforms state-of-the-art image set classification methods terms effectiveness efficiency.",4 "belief merging source reliability assessment. merging beliefs requires plausibility sources information merged. typically assumed equally reliable lack hints indicating otherwise; yet, recent line research spun idea deriving information revision process itself. particular, history previous revisions previous merging examples provide information performing subsequent mergings. yet, examples previous revisions may available. spite apparent lack information, something still inferred try-and-check approach: relative reliability ordering assumed, merging process performed based it, result compared original information. outcome check may incoherent initial assumption, like completely reliable source rejected information provided. cases, reliability ordering assumed first place excluded consideration. first theorem article proves scenario indeed possible. results obtained various definition reliability merging.",4 "detailed, accurate, human shape estimation clothed 3d scan sequences.
address problem estimating human pose body shape 3d scans time. reliable estimation 3d body shape necessary many applications including virtual try-on, health monitoring, avatar creation virtual reality. scanning bodies minimal clothing, however, presents practical barrier applications. address problem estimating body shape clothing sequence 3d scans. previous methods exploited body models produce smooth shapes lacking personalized details. contribute new approach recover personalized shape person. estimated shape deviates parametric model fit 3d scans. demonstrate method using high quality 4d data well sequences visual hulls extracted multi-view images. also make available buff, new 4d dataset enables quantitative evaluation (http://buff.is.tue.mpg.de). method outperforms state art pose estimation shape estimation, qualitatively quantitatively.",4 "efficient methods unsupervised learning probabilistic models. thesis develop variety techniques train, evaluate, sample intractable high dimensional probabilistic models. abstract exceeds arxiv space limitations -- see pdf.",4 "dictionary based approach edge detection. edge detection essential part image processing, quality accuracy detection determines success processing. developed new self learning technique edge detection using dictionary comprised eigenfilters constructed using features input image. dictionary based method eliminates need pre post processing image accounts noise, blurriness, class image variation illumination detection process itself. since, method depends characteristics image, new technique detect edges accurately capture greater detail existing algorithms sobel, prewitt laplacian gaussian, canny method etc use generic filters operators. demonstrated application various classes images text, face, barcodes, traffic cell images. application technique cell counting microscopic image also presented.",4 "hilbert space embeddings pomdps. nonparametric approach policy learning pomdps proposed. 
approach represents distributions states, observations, actions embeddings feature spaces, reproducing kernel hilbert spaces. distributions states given observations obtained applying kernel bayes' rule distribution embeddings. policies value functions defined feature space states, leads feature space expression bellman equation. value iteration may used estimate optimal value function associated policy. experimental results confirm correct policy learned using feature space representation.",4 "theoretical framework robustness (deep) classifiers adversarial examples. machine learning classifiers, including deep neural networks, vulnerable adversarial examples. inputs typically generated adding small purposeful modifications lead incorrect outputs imperceptible human eyes. goal paper introduce single method, make theoretical steps towards fully understanding adversarial examples. using concepts topology, theoretical analysis brings forth key reasons adversarial example fool classifier ($f_1$) adds oracle ($f_2$, like human eyes) analysis. investigating topological relationship two (pseudo)metric spaces corresponding predictor $f_1$ oracle $f_2$, develop necessary sufficient conditions determine $f_1$ always robust (strong-robust) adversarial examples according $f_2$. interestingly theorems indicate one unnecessary feature make $f_1$ strong-robust, right feature representation learning key getting classifier accurate strong-robust.",4 "deep learning physical processes: incorporating prior scientific knowledge. consider use deep learning methods modeling complex phenomena like occurring natural physical processes. large amount data gathered phenomena data intensive paradigm could begin challenge traditional approaches elaborated years fields like maths physics. however, despite considerable successes variety application domains, machine learning field yet ready handle level complexity required problems. 
using example application, namely sea surface temperature prediction, show general background knowledge gained physics could used guideline designing efficient deep learning models. order motivate approach assess generality demonstrate formal link solution class differential equations underlying large family physical phenomena proposed model. experiments comparison series baselines including state art numerical approach provided.",4 production system rules protein complexes genetic regulatory networks. short paper introduces new way design production system rules. indirect encoding scheme presented views rules protein complexes produced temporal behaviour artificial genetic regulatory network. initial study begins using simple boolean regulatory network produce traditional ternary-encoded rules moving fuzzy variant produce real-valued rules. competitive performance shown related genetic regulatory networks rule-based systems benchmark problems.,4 "unsupervised iterative deep learning speech features acoustic tokens applications spoken term detection. paper aim automatically discover high quality frame-level speech features acoustic tokens directly unlabeled speech data. multi-granular acoustic tokenizer (mat) proposed automatic discovery multiple sets acoustic tokens given corpus. acoustic token set specified set hyperparameters describing model configuration. different sets acoustic tokens carry different characteristics given corpus language behind, thus mutually reinforced. multiple sets token labels used targets multi-target deep neural network (mdnn) trained frame-level acoustic features. bottleneck features extracted mdnn used feedback input mat mdnn next iteration. multi-granular acoustic token sets frame-level speech features iteratively optimized iterative deep learning framework. call framework multi-granular acoustic tokenizing deep neural network (matdnn). 
the results were evaluated using the metrics and corpora defined in the zero resource speech challenge organized at interspeech 2015, and improved performance was obtained with a set of experiments of query-by-example spoken term detection on the same corpora. visualization for the discovered tokens against english phonemes is also shown.",4 "underwater multi-robot convoying using visual tracking by detection. we present a robust multi-robot convoying approach that relies on visual detection of the leading agent, thus enabling target following in unstructured 3-d environments. our method is based on the idea of tracking-by-detection, which interleaves efficient model-based object detection with temporal filtering of image-based bounding box estimation. this approach has the important advantage of mitigating tracking drift (i.e. drifting away from the target object), which is a common symptom of model-free trackers and is detrimental to sustained convoying in practice. to illustrate our solution, we collected extensive footage of an underwater robot in ocean settings, and hand-annotated its location in each frame. based on this dataset, we present an empirical comparison of multiple tracker variants, including the use of several convolutional neural networks, both with and without recurrent connections, as well as frequency-based model-free trackers. we also demonstrate the practicality of this tracking-by-detection strategy in real-world scenarios by successfully controlling a legged underwater robot in five degrees of freedom to follow another robot's independent motion.",4 "a dual approach to scalable verification of deep networks. this paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that the outputs of the neural network will always behave in a certain way for a given class of inputs. most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of the properties to be verified. in contrast, our framework applies to a much more general class of activation functions and specifications on neural network inputs and outputs. we formulate verification as an optimization problem and solve a lagrangian relaxation of the optimization problem to obtain an upper bound on the verification objective. our approach is anytime, i.e. it can be stopped at any time and a valid bound on the objective can be obtained.
we further develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.",4 "modeep: a deep learning framework using motion features for human pose estimation. in this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. we propose a new human body pose dataset, flic-motion, which extends the flic dataset with additional motion features. we apply our architecture to this dataset and report significantly better performance than current state-of-the-art pose detection systems.",4 "detecting multiword phrases in mathematical text corpora. we present an approach for detecting multiword phrases in mathematical text corpora. the method used is based on characteristic features of mathematical terminology. it makes use of a software tool named lingo which allows to identify words by means of previously defined dictionaries for specific word classes such as adjectives, personal names or nouns. the detection of multiword groups is done algorithmically. possible advantages of the method for indexing and information retrieval, and conclusions for applying dictionary-based methods of automatic indexing instead of stemming procedures, are discussed.",4 "uncovering latent style factors for expressive speech synthesis. prosodic modeling is a core problem in speech synthesis. the key challenge is producing desirable prosody from textual input containing only phonetic information. in this preliminary study, we introduce the concept of ""style tokens"" in tacotron, a recently proposed end-to-end neural speech synthesis model. using style tokens, we aim to extract independent prosodic styles from training data. we show that without annotation data or an explicit supervision signal, our approach can automatically learn a variety of prosodic variations in a purely data-driven way. importantly, each style token corresponds to a fixed style factor regardless of the given text sequence. as a result, we can control the prosodic style of synthetic speech in a somewhat predictable and globally consistent way.",4 "multi-document summarization via discriminative summary reranking.
existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. however, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets. on the contrary, a baseline summarization model may produce high-quality summaries for some document sets. based on the above observations, we treat the summaries produced by different summarization models as candidate summaries, and then explore discriminative reranking techniques to identify high-quality summaries from the candidates for different document sets. we propose to extract a set of candidate summaries for each document set based on an ilp framework, and then leverage ranking svm for summary reranking. various useful features have been developed for the reranking process, including word-level features, sentence-level features and summary-level features. evaluation results on the benchmark duc datasets validate the efficacy and robustness of our proposed approach.",4 "understanding physics from interconnected data. metal melting on release after explosion is a physical system far from equilibrium. a complete physical model of this system does not exist, because many interrelated effects have to be considered. a general methodology needs to be developed to describe and understand the physical phenomena involved. high noise in the data, moving blur in the images, a high degree of uncertainty due to the different types of sensors, and information entangled and hidden inside noisy images make reasoning about the physical processes difficult. the major problems include proper information extraction and the problem of reconstruction, as well as prediction of missing data. in this paper, several techniques addressing the first problem are given, building a basis for tackling the second problem.",4 "robust face recognition via block sparse bayesian learning. face recognition (fr) is an important task in pattern recognition and computer vision. sparse representation (sr) has been demonstrated to be a powerful framework for fr.
in general, an sr algorithm treats each face in a training dataset as a basis function, and tries to find a sparse representation of a test face under these basis functions. the sparse representation coefficients then provide a recognition hint. early sr algorithms were based on a basic sparse model. recently, it has been found that algorithms based on a block sparse model can achieve better recognition rates. based on this model, in this study we use block sparse bayesian learning (bsbl) to find a sparse representation of a test face for recognition. bsbl is a recently proposed framework, which has many advantages over existing block-sparse-model based algorithms. experimental results on the extended yale b, the ar and the cmu pie face databases show that using bsbl can achieve better recognition rates and higher robustness than state-of-the-art algorithms in most cases.",4 "helping domain experts build speech translation systems. we present a new platform, ""regulus lite"", which supports rapid development and web deployment of several types of phrasal speech translation systems using a minimal formalism. a distinguishing feature is that the development work can be performed directly by domain experts. we motivate the need for platforms of this type and discuss three specific cases: medical speech translation, speech-to-sign-language translation and voice questionnaires. we briefly describe initial experiences in developing practical systems.",4 "attention-based guided structured sparsity of deep neural networks. network pruning is aimed at imposing sparsity in a neural network architecture by increasing the portion of zero-valued weights for reducing its size regarding energy-efficiency considerations and increasing evaluation speed. in most of the conducted research efforts, the sparsity is enforced for network pruning without any attention to the internal network characteristics such as unbalanced outputs of the neurons, or more specifically the distribution of the weights and outputs of the neurons. that may cause a severe accuracy drop due to uncontrolled sparsity. in this work, we propose an attention mechanism that simultaneously controls the sparsity intensity and supervised network pruning by keeping the important information bottlenecks of the network active.
on cifar-10, the proposed method outperforms the best baseline method by 6% and reduces the accuracy drop by 2.6x at the same level of sparsity.",4 "a generalized linear mixing model accounting for endmember variability. endmember variability is an important factor for accurately unveiling vital information relating the pure materials and their distribution in hyperspectral images. recently, the extended linear mixing model (elmm) has been proposed as a modification of the linear mixing model (lmm) to consider endmember variability effects resulting mainly from illumination changes. in this paper, we further generalize the elmm leading to a new model (glmm) to account for more complex spectral distortions where different wavelength intervals can be affected unevenly. we also extend the existing methodology to jointly estimate the variability and the abundances for the glmm. simulations with real and synthetic data show that the unmixing process can benefit from the extra flexibility introduced by the glmm.",4 "categorization axioms for clustering results. cluster analysis has attracted more and more attention in the field of machine learning and data mining. numerous clustering algorithms have been proposed and developed due to diverse theories and the various requirements of emerging applications. therefore, it is worth establishing a unified axiomatic framework for data clustering. in the literature, this is an open problem and has proved to be very challenging. in this paper, clustering results are axiomatized by assuming that a proper clustering result should satisfy categorization axioms. the proposed axioms not only introduce a classification of clustering results and inequalities of clustering results, but are also consistent with prototype theory and exemplar theory of categorization models in cognitive science. moreover, the proposed axioms lead to three principles of designing a clustering algorithm and a cluster validity index, which are followed by many popular clustering algorithms and cluster validity indices.",4 "multi-channel weighted nuclear norm minimization for real color image denoising. most of the existing denoising algorithms are developed for grayscale images, while it is not a trivial work to extend them for color image denoising because the noise statistics in the r, g, b channels can be very different for real noisy images. in this paper, we propose a multi-channel (mc) optimization model for real color image denoising under the weighted nuclear norm minimization (wnnm) framework.
we concatenate the rgb patches to make use of the channel redundancy, and introduce a weight matrix to balance the data fidelity of the three channels in consideration of their different noise statistics. the proposed mc-wnnm model does not have an analytical solution. we reformulate it into a linear equality-constrained problem and solve it with the alternating direction method of multipliers. each alternative updating step has a closed-form solution and the convergence can be guaranteed. extensive experiments on both synthetic and real noisy image datasets demonstrate the superiority of the proposed mc-wnnm over state-of-the-art denoising methods.",4 "acquisition of visual features through probabilistic spike-timing-dependent plasticity. the final version of this paper has been published in ieeexplore and is available at http://ieeexplore.ieee.org/document/7727213. please cite this paper as: amirhossein tavanaei, timothee masquelier, and anthony maida, acquisition of visual features through probabilistic spike-timing-dependent plasticity. ieee international joint conference on neural networks, pp. 307-314, ijcnn 2016. this paper explores modifications to a feedforward five-layer spiking convolutional network (scn) of the ventral visual stream [masquelier, t., thorpe, s., unsupervised learning of visual features through spike timing dependent plasticity. plos computational biology, 3(2), 247-257]. the original model showed that a spike-timing-dependent plasticity (stdp) learning algorithm embedded in an appropriately selected scn could perform unsupervised feature discovery. the discovered features were interpretable and could effectively be used to perform rapid binary decisions in a classifier. in order to study the robustness of the previous results, the present research examines the effects of modifying some of the components of the original model. for improved biological realism, we replace the original non-leaky integrate-and-fire neurons with izhikevich-like neurons. we also replace the original stdp rule with a novel rule that has a probabilistic interpretation. the probabilistic stdp slightly but significantly improves the performance for both types of model neurons. use of the izhikevich-like neuron was not found to improve performance, although the performance was still comparable to that of the original neuron. this shows that the model is robust enough to handle more biologically realistic neurons.
we also conclude that the underlying reasons for the stable performance of the model are preserved despite the overt changes to the explicit components of the model.",4 "equations of states in singular statistical estimation. learning machines that have hierarchical structures or hidden variables are singular statistical models because they are nonidentifiable and their fisher information matrices are singular. in singular statistical models, neither does the bayes a posteriori distribution converge to the normal distribution nor does the maximum likelihood estimator satisfy asymptotic normality. this is the main reason why it has been difficult to predict their generalization performances from trained states. in this paper, we study four errors, (1) the bayes generalization error, (2) the bayes training error, (3) the gibbs generalization error, and (4) the gibbs training error, and prove mathematical relations among these errors. the formulas proved in this paper are equations of states in statistical estimation because they hold for any true distribution, any parametric model, and any a priori distribution. we also show that the bayes and gibbs generalization errors can be estimated by the bayes and gibbs training errors, and we propose widely applicable information criteria that can be applied to both regular and singular statistical models.",4 "parallel multi channel convolution using general matrix multiplication. convolutional neural networks (cnns) have emerged as one of the most successful machine learning technologies for image and video processing. the most computationally intensive parts of cnns are the convolutional layers, which convolve multi-channel images with multiple kernels. a common approach to implementing convolutional layers is to expand the image into a column matrix (im2col) and perform multiple channel multiple kernel (mcmk) convolution using an existing parallel general matrix multiplication (gemm) library. this im2col conversion greatly increases the memory footprint of the input matrix and reduces data locality. in this paper we propose a new approach to mcmk convolution that is based on general matrix multiplication (gemm), but not on im2col. our algorithm eliminates the need for data replication on the input, thereby enabling us to apply the convolution kernels on the input images directly. we have implemented several variants of our algorithm on a cpu processor and an embedded arm processor.
on the cpu, our algorithm is faster than im2col in most cases.",4 "machine comprehension based on learning to rank. machine comprehension plays an essential role in nlp and has been widely explored with datasets like mctest. however, this dataset is too simple and too small for learning true reasoning abilities. \cite{hermann2015teaching} therefore release a large scale news article dataset and propose a deep lstm reader system for machine comprehension. however, the training process is expensive. we therefore try a feature-engineered approach with semantics on the new dataset to see how traditional machine learning techniques with semantics can help with machine comprehension. meanwhile, our proposed l2r reader system achieves good performance with efficiency and less training data.",4 effective sparse representation of x-ray medical images. effective sparse representation of x-ray medical images within the context of data reduction is considered. the proposed framework is shown to render an enormous reduction in the cardinality of the data set required to represent this class of images at very good quality. the particularity of the approach is that it can be implemented at very competitive processing time and low memory requirements,4 "deepvisage: making face recognition simple yet with powerful generalization skills. face recognition (fr) methods report significant performance by adopting convolutional neural network (cnn) based learning methods. although cnns are mostly trained by optimizing the softmax loss, the recent trend shows an improvement of accuracy with different strategies, such as task-specific cnn learning with different loss functions, fine-tuning on the target dataset, metric learning and concatenating features from multiple cnns. incorporating these tasks obviously requires additional efforts. moreover, it demotivates the discovery of efficient cnn models for fr which are trained only with identity labels. we focus on this fact and propose an easily trainable and single cnn based fr method. our cnn model exploits the residual learning framework. additionally, it uses normalized features to compute the loss. our extensive experiments show excellent generalization on different datasets. we obtain very competitive and state-of-the-art results on the lfw, ijb-a, youtube faces and cacd datasets.",4 "tomographic reconstruction using a global statistical prior.
recent research in tomographic reconstruction is motivated by the need to efficiently recover detailed anatomy from limited measurements. one of the ways to compensate for the increasingly sparse sets of measurements is to exploit the information from templates, i.e., prior data available in the form of already reconstructed, structurally similar images. towards this, previous work has exploited the use of a set of global and patch based dictionary priors. in this paper, we propose a global prior to improve both the speed and quality of tomographic reconstruction within a compressive sensing framework. we choose a set of potential representative 2d images, referred to as templates, to build an eigenspace; this is subsequently used to guide the iterative reconstruction of a similar slice from sparse acquisition data. our experiments across a diverse range of datasets show that reconstruction using an appropriate global prior, apart from being faster, gives a much lower reconstruction error when compared to the state of the art.",4 "happy travelers take big pictures: a psychological study with machine learning and big data. in psychology, theory-driven researches are usually conducted with extensive laboratory experiments, yet rarely tested or disproved with big data. in this paper, we make use of 418k travel photos with traveler ratings to test the influential ""broaden-and-build"" theory, which suggests that positive emotions broaden one's visual attention. the core hypothesis examined in this study is that positive emotion is associated with a wider attention, hence highly-rated sites would trigger wide-angle photographs. by analyzing travel photos, we find a strong correlation between a preference for wide-angle photos and the high rating of tourist sites on tripadvisor. we are able to carry out this analysis through the use of deep learning algorithms to classify the photos into wide and narrow angles, and present this study as an exemplar of how big data and deep learning can be used to test laboratory findings in the wild.",4 "speeding up a sat solver by exploring cnf symmetries: revisited. boolean satisfiability solvers have gone through dramatic improvements in their performances and scalability over the last few years by considering symmetries. it has been shown that by using graph symmetries and generating symmetry breaking predicates (sbps) it is possible to break symmetries in a conjunctive normal form (cnf).
the sbps cut down the search space to the nonsymmetric regions of the space without affecting the satisfiability of the cnf formula. the symmetry breaking predicates are created by representing the formula as a graph and finding the graph symmetries using a symmetry extraction mechanism (crawford et al.). in this paper we take one non-trivial cnf and explore its symmetries. finally, we generate sbps and, by adding them to the cnf, show that this helps to prune the search tree, so that a sat solver would take a shorter time. we present the pruning procedure of the search tree from scratch, starting from the cnf and its graph representation. we explore the whole mechanism with a non-trivial example, which should be easily comprehendible. we have also given a new idea for generating symmetry breaking predicates for breaking symmetry in a cnf, derived from crawford's conditions. at last we propose a backtrack sat solver with an inbuilt sbp generator.",12 "deep generative filter for motion deblurring. removing blur caused by camera shake in images has always been a challenging problem in the computer vision literature due to its ill-posed nature. motion blur caused due to the relative motion between the camera and the object in 3d space induces a spatially varying blurring effect over the entire image. in this paper, we propose a novel deep filter based on a generative adversarial network (gan) architecture integrated with a global skip connection and a dense architecture in order to tackle this problem. our model, while bypassing the process of blur kernel estimation, significantly reduces the test time which is necessary for practical applications. the experiments on benchmark datasets prove the effectiveness of the proposed method, which outperforms state-of-the-art blind deblurring algorithms both quantitatively and qualitatively.",4 "cost-based feature transfer for vehicle occupant classification. knowledge of human presence and interaction in a vehicle is of growing interest to vehicle manufacturers for design and safety purposes. we present a framework to perform the tasks of occupant detection and occupant classification for automatic child locks and airbag suppression. it operates for all passenger seats, using a single overhead camera. a transfer learning technique is introduced to make full use of training data from all seats whilst still maintaining some control over the bias, necessary for a system designed to penalize certain misclassifications more than others.
an evaluation is performed on a challenging dataset with both weighted and unweighted classifiers, demonstrating the effectiveness of the transfer process.",4 "near optimal behavior via approximate state abstraction. the combinatorial explosion that plagues planning and reinforcement learning (rl) algorithms can be moderated using state abstraction. prohibitively large task representations can be condensed such that essential information is preserved, and consequently, solutions are tractably computable. however, exact abstractions, which treat only fully-identical situations as equivalent, fail to present opportunities for abstraction in environments where no two situations are exactly alike. in this work, we investigate approximate state abstractions, which treat nearly-identical situations as equivalent. we present theoretical guarantees of the quality of behaviors derived from four types of approximate abstractions. additionally, we empirically demonstrate that approximate abstractions lead to a reduction in task complexity and a bounded loss of optimality of behavior in a variety of environments.",4 "mining generalized graph patterns based on user examples. there has been a lot of recent interest in mining patterns from graphs. often, the exact structure of the patterns of interest is not known. this happens, for example, when molecular structures are mined to discover fragments useful as features in chemical compound classification tasks, or when web sites are mined to discover sets of web pages representing logical documents. such patterns are often generated from a few small subgraphs (cores), according to certain generalization rules (grs). we call such patterns ""generalized patterns"" (gps). while being structurally different, gps often perform the same function in the network. previously proposed approaches to mining gps either assumed that the cores and the grs are given, or that all interesting gps are frequent. these are strong assumptions, which often do not hold in practical applications. in this paper, we propose an approach to mining gps that is free of these assumptions. given a small number of gps selected by the user, our algorithm discovers all gps similar to the user examples. first, a machine learning-style approach is used to find the cores. second, generalizations of the cores in the graph are computed to identify gps.
an evaluation on synthetic data, generated using real cores and grs from biological and web domains, demonstrates the effectiveness of our approach.",4 "high dimensional semiparametric gaussian copula graphical models. in this paper, we propose a semiparametric approach, named nonparanormal skeptic, for efficiently and robustly estimating high dimensional undirected graphical models. to achieve modeling flexibility, we consider gaussian copula graphical models (or the nonparanormal) as proposed by liu et al. (2009). to achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including spearman's rho and kendall's tau. in high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in graph parameter estimation. this celebrating result suggests that the gaussian copula graphical models can be used as a safe replacement of the popular gaussian graphical models, even when the data are truly gaussian. besides theoretical analysis, we also conduct thorough numerical simulations to compare different estimators for their graph recovery performance under both ideal and noisy settings. the proposed methods are then applied on a large-scale genomic dataset to illustrate their empirical usefulness. the r language software package huge implementing the proposed methods is available on the comprehensive r archive network: http://cran.r-project.org/.",19 "surrogate regret bounds for bipartite ranking via strongly proper losses. the problem of bipartite ranking, where instances are labeled positive or negative and the goal is to learn a scoring function that minimizes the probability of mis-ranking a pair of positive and negative instances (or equivalently, that maximizes the area under the roc curve), has been widely studied in recent years. a dominant theoretical and algorithmic framework for the problem has been to reduce bipartite ranking to pairwise classification; in particular, it is well known that the bipartite ranking regret can be formulated as a pairwise classification regret, which in turn can be upper bounded using usual regret bounds for classification problems. recently, kotlowski et al. (2011) showed regret bounds for bipartite ranking in terms of the regret associated with balanced versions of the standard (non-pairwise) logistic and exponential losses.
in this paper, we show that such (non-pairwise) surrogate regret bounds for bipartite ranking can be obtained in terms of a broad class of proper (composite) losses that we term strongly proper. our proof technique is much simpler than that of kotlowski et al. (2011), and relies on properties of proper (composite) losses as elucidated recently by reid and williamson (2010, 2011) and others. our result yields explicit surrogate bounds (with no hidden balancing terms) in terms of a variety of strongly proper losses, including for example the logistic, exponential, squared and squared hinge losses as special cases. we also obtain tighter surrogate bounds under certain low-noise conditions via a recent result of clemencon and robbiano (2011).",4 "discriminative optimization: theory and applications to computer vision problems. many computer vision problems are formulated as the optimization of a cost function. this approach faces two main challenges: (i) designing a cost function with a local optimum at an acceptable solution, and (ii) developing an efficient numerical method to search for one (or multiple) of such local optima. while designing such functions is feasible in the noiseless case, the stability and location of local optima are mostly unknown under noise, occlusion, or missing data. in practice, this can result in undesirable local optima or in not having a local optimum in the expected place. on the other hand, numerical optimization algorithms in high-dimensional spaces are typically local and often rely on expensive first or second order information to guide the search. to overcome these limitations, this paper proposes discriminative optimization (do), a method that learns search directions from data without the need of a cost function. specifically, do explicitly learns a sequence of updates in the search space that leads to stationary points that correspond to the desired solutions. we provide a formal analysis of do and illustrate its benefits in the problems of 3d point cloud registration, camera pose estimation, and image denoising. we show that do performed comparably or outperformed state-of-the-art algorithms in terms of accuracy, robustness to perturbations, and computational efficiency.",4 "statistical learning of arbitrary computable classifiers. statistical learning theory chiefly studies restricted hypothesis classes, particularly those with finite vapnik-chervonenkis (vc) dimension.
the fundamental quantity of interest is the sample complexity: the number of samples required to learn to a specified level of accuracy. here we consider learning over the set of all computable labeling functions. since the vc-dimension is infinite and a priori (uniform) bounds on the number of samples are impossible, we let the learning algorithm decide when it has seen sufficient samples to have learned. we first show that learning in this setting is indeed possible, and develop a learning algorithm. we then show, however, that bounding sample complexity independently of the distribution is impossible. notably, this impossibility is entirely due to the requirement that the learning algorithm be computable, and not due to the statistical nature of the problem.",4 "discrete dynamical genetic programming in xcs. a number of representation schemes have been presented for use within learning classifier systems, ranging from binary encodings to neural networks. this paper presents results from an investigation into using a discrete dynamical system representation within the xcs learning classifier system. in particular, asynchronous random boolean networks are used to represent the traditional condition-action production system rules. it is shown possible to use self-adaptive, open-ended evolution to design an ensemble of such discrete dynamical systems within xcs to solve a number of well-known test problems.",4 "coarse to fine non-rigid registration: a chain of scale-specific neural networks for multimodal image alignment with application to remote sensing. we tackle here the problem of multimodal image non-rigid registration, which is of prime importance in remote sensing and medical imaging. the difficulties encountered by classical registration approaches include feature design and slow optimization by gradient descent. by analyzing these methods, we note the significance of the notion of scale. we design easy-to-train, fully-convolutional neural networks able to learn scale-specific features. once chained appropriately, they perform global registration in linear time, getting rid of gradient descent schemes by predicting directly the deformation. we show their performance in terms of quality and speed through various tasks of remote sensing multimodal image alignment.
in particular, we are able to register correctly cadastral maps of buildings as well as road polylines onto rgb images, and outperform current keypoint matching methods.",4 "multigrid with rough coefficients and multiresolution operator decomposition from hierarchical information games. we introduce a near-linear complexity (geometric and meshless/algebraic) multigrid/multiresolution method for pdes with rough ($l^\infty$) coefficients with rigorous a-priori accuracy and performance estimates. the method is discovered through a decision/game theory formulation of the problems of (1) identifying restriction and interpolation operators, (2) recovering a signal from incomplete measurements based on norm constraints on its image under a linear operator, and (3) gambling on the value of the solution of the pde based on a hierarchy of nested measurements of its solution or source term. the resulting elementary gambles form a hierarchy of (deterministic) basis functions of $h^1_0(\omega)$ (gamblets) that (1) are orthogonal across subscales/subbands with respect to the scalar product induced by the energy norm of the pde, (2) enable sparse compression of the solution space in $h^1_0(\omega)$, and (3) induce an orthogonal multiresolution operator decomposition. the operating diagram of the multigrid method is that of an inverted pyramid in which gamblets are computed locally (by virtue of their exponential decay), hierarchically (from fine to coarse scales), and the pde is decomposed into a hierarchy of independent linear systems with uniformly bounded condition numbers. the resulting algorithm is parallelizable both in space (via localization) and in bandwidth/subscale (subscales can be computed independently from each other). although the method is deterministic, it has a natural bayesian interpretation under the measure of probability emerging (as a mixed strategy) from the information game formulation, and multiresolution approximations form a martingale with respect to the filtration induced by the hierarchy of nested measurements.",12 "inductive sparse subspace clustering. sparse subspace clustering (ssc) has achieved state-of-the-art clustering quality by performing spectral clustering over an $\ell^{1}$-norm based similarity graph. however, ssc is a transductive method which cannot handle the data not used to construct the graph (out-of-sample data).
for each new datum, ssc requires solving $n$ optimization problems in o(n) variables and performing the algorithm over the whole data set, where $n$ is the number of data points. therefore, it is inefficient to apply ssc in fast online clustering and scalable graphing. in this letter, we propose an inductive spectral clustering algorithm, called inductive sparse subspace clustering (issc), which makes ssc feasible to cluster out-of-sample data. issc adopts the assumption that high-dimensional data actually lie on a low-dimensional manifold such that out-of-sample data could be grouped in the embedding space learned from in-sample data. experimental results show that issc is promising in clustering out-of-sample data.",4 "multilevel context representation for improving object recognition. in this work, we propose the combined usage of low- and high-level blocks of convolutional neural networks (cnns) for improving object recognition. while recent research focused on either propagating the context from all layers, e.g. resnet (including the low-level layers), or having multiple loss layers (e.g. googlenet), the importance of the features close to the higher layers is ignored. this paper postulates that the use of context closer to the high-level layers provides the scale and translation invariance and works better than using the top layer only. in particular, we extend alexnet and googlenet by additional connections in the top $n$ layers. in order to demonstrate the effectiveness of the proposed approach, we evaluated it on the standard imagenet task. the relative reduction of the classification error is around 1-2% without affecting the computational cost. furthermore, we show that this approach is orthogonal to typical test data augmentation techniques, as recently introduced by szegedy et al. (leading to a runtime reduction of 144 during test time).",4 "fostering user engagement: rhetorical devices for applause generation learnt from ted talks. one problem that every presenter faces when delivering a public discourse is how to hold the listeners' attentions or to keep them involved. therefore, many studies in conversation analysis work on this issue and suggest qualitatively constructions that can effectively lead to the audience's applause. to investigate these proposals quantitatively, in this study we analyze the transcripts of 2,135 ted talks, with a particular focus on the rhetorical devices that are used by the presenters for applause elicitation.
through conducting regression analysis, we identify and interpret 24 rhetorical devices as triggers of audience applauding. we further build models that can recognize applause-evoking sentences and conclude our work with potential implications.",4 "a new approach to the translation of isolated units in english-korean machine translation. it is the most effective way for the quick translation of the tremendous amount of explosively increasing science and technique information material to develop a practicable machine translation system and introduce it into translation practice. this essay treats the problems arising in the translation of isolated units on the basis of the practical materials and experiments obtained in the development and introduction of an english-korean machine translation system. in other words, this essay considers the establishment of information for isolated units and their korean equivalents in word order.",4 "a multiplicative algorithm on orthogonal groups for independent component analysis. the multiplicative newton-like method developed by the author et al. is extended to the situation where the dynamics is restricted to the orthogonal group. a general framework is constructed without specifying the cost function. though the restriction to the orthogonal groups makes the problem somewhat complicated, an explicit expression for the amount of individual jumps is obtained. this algorithm is exactly second-order-convergent. the global instability inherent in the newton method is remedied by a levenberg-marquardt-type variation. the method thus constructed can readily be applied to independent component analysis. its remarkable performance is illustrated by a numerical simulation.",4 "resource allocation using metaheuristic search. this research is focused on solving problems in the area of software project management using metaheuristic search algorithms from the research field of search based software engineering. the main aim of this research is to evaluate the performance of different metaheuristic search techniques in resource allocation and scheduling problems that would be typical of software development projects. this paper reports a set of experiments that evaluate the performance of three algorithms, namely simulated annealing, tabu search and genetic algorithms.
experimental results indicate metaheuristic search techniques used solve problems resource allocation scheduling within software project. finally, comparative analysis suggests overall genetic algorithm performed better simulated annealing tabu search.",4 "neural probabilistic model non-projective mst parsing. paper, propose probabilistic parsing model, defines proper conditional probability distribution non-projective dependency trees given sentence, using neural representations inputs. neural network architecture based bi-directional lstm-cnns benefits word- character-level representations automatically, using combination bidirectional lstm cnn. top neural network, introduce probabilistic structured layer, defining conditional log-linear model non-projective trees. evaluate model 17 different datasets, across 14 different languages. exploiting kirchhoff's matrix-tree theorem (tutte, 1984), partition functions marginals computed efficiently, leading straight-forward end-to-end model training procedure via back-propagation. parser achieves state-of-the-art parsing performance nine datasets.",4 "memristor crossbar-based hardware implementation ids method. ink drop spread (ids) engine active learning method (alm), methodology soft computing. ids, pattern-based processing unit, extracts useful information system subjected modeling. spite excellent potential solving problems classification modeling compared soft computing tools, finding simple fast hardware implementation still challenge. paper describes new hardware implementation ids method based memristor crossbar structure. addition simplicity, completely real-time, low latency ability continue working occurrence power breakdown advantages proposed circuit.",4 "associative long short-term memory. investigate new method augment recurrent neural networks extra memory without increasing number network parameters.
system associative memory based complex-valued vectors closely related holographic reduced representations long short-term memory networks. holographic reduced representations limited capacity: store information, retrieval becomes noisier due interference. system contrast creates redundant copies stored information, enables retrieval reduced noise. experiments demonstrate faster learning multiple memorization tasks.",4 "understanding version multivariate symmetric uncertainty assist feature selection. paper, analyze behavior multivariate symmetric uncertainty (msu) measure use statistical simulation techniques various mixes informative non-informative randomly generated features. experiments show number attributes, cardinalities, sample size affect msu. discovered condition preserves good quality msu different combinations three factors, providing new useful criterion help drive process dimension reduction.",4 "combinatorial approach object analysis. present perceptional mathematical model image signal analysis. resemblance measure defined, submitted innovating combinatorial optimization algorithm. numerical simulations also presented",13 "analog simulator integro-differential equations classical memristors. analog computer makes use continuously changeable quantities system, electrical, mechanical, hydraulic properties, solve given problem. devices usually computationally powerful digital counterparts, suffer analog noise allow error control. focus analog computers based active electrical networks comprised resistors, capacitors, operational amplifiers capable simulating linear ordinary differential equation. however, class nonlinear dynamics solve limited. work, adding memristors electrical network, show analog computer simulate large variety linear nonlinear integro-differential equations carefully choosing conductance dynamics memristor state variable. 
study performance analog computers simulating integro-differential models fluid dynamics type, nonlinear volterra equations population growth, quantum models describing non-markovian memory effects, among others. finally, perform stability tests considering imperfect analog components, obtaining robust solutions $13\%$ relative error relevant timescales.",4 "whiteout: gaussian adaptive noise regularization feedforward neural networks. noise injection (ni) approach mitigate over-fitting feedforward neural networks (nns). bernoulli ni procedure implemented dropout shakeout connections $l_1$ $l_2$ regularization nn model parameters demonstrates efficiency feasibility ni regularizing nns. propose whiteout, new ni regularization technique adaptive gaussian noise nns. whiteout versatile dropout shakeout. show optimization objective function associated whiteout generalized linear models closed-form penalty term connections wide range regularization includes bridge, lasso, ridge, elastic net penalization special cases; also extended offer regularization similar adaptive lasso group lasso. prove whiteout also viewed robust learning nns presence small perturbations input hidden nodes. establish noise-perturbed empirical loss function whiteout converges almost surely ideal loss function, estimates nn parameters obtained minimizing former loss function consistent obtained minimizing ideal loss function. computationally, whiteout easily incorporated back-propagation algorithm. superiority whiteout dropout shakeout learning nns relatively small sized training data demonstrated using lsvt voice rehabilitation data libras hand movement data.",19 "theoretical insights optimization landscape over-parameterized shallow neural networks. paper study problem learning shallow artificial neural network best fits training data set. study problem over-parameterized regime number observations fewer number parameters model. 
show quadratic activations optimization landscape training shallow neural networks certain favorable characteristics allow globally optimal models found efficiently using variety local search heuristics. result holds arbitrary training data input/output pairs. differentiable activation functions also show gradient descent, suitably initialized, converges linear rate globally optimal model. result focuses realizable model inputs chosen i.i.d. gaussian distribution labels generated according planted weight coefficients.",4 "pursuit temporal accuracy general activity detection. detecting activities untrimmed videos important challenging task. performance existing methods remains unsatisfactory, e.g., often meet difficulties locating beginning end long complex action. paper, propose generic framework accurately detect wide variety activities untrimmed videos. first contribution novel proposal scheme efficiently generate candidates accurate temporal boundaries. contribution cascaded classification pipeline explicitly distinguishes relevance completeness candidate instance. two challenging temporal activity detection datasets, thumos14 activitynet, proposed framework significantly outperforms existing state-of-the-art methods, demonstrating superior accuracy strong adaptivity handling activities various temporal structures.",4 "accurate image super-resolution using deep convolutional networks. present highly accurate single-image super-resolution (sr) method. method uses deep convolutional network inspired vgg-net used imagenet classification \cite{simonyan2015very}. find increasing network depth shows significant improvement accuracy. final model uses 20 weight layers. cascading small filters many times deep network structure, contextual information large image regions exploited efficient way. deep networks, however, convergence speed becomes critical issue training. propose simple yet effective training procedure. 
learn residuals use extremely high learning rates ($10^4$ times higher srcnn \cite{dong2015image}) enabled adjustable gradient clipping. proposed method performs better existing methods accuracy visual improvements results easily noticeable.",4 "learn slow self-avoiding adaptive walks infinite radius search algorithm?. slow self-avoiding adaptive walks infinite radius search algorithm (limax) analyzed themselves, network form. study conducted several nk problems two hiff problems. find examination ""slacker"" walks networks indicate relative search difficulty within family problems, help identify potential local optima, detect presence structure fitness landscapes. hierarchical walks used differentiate rugged landscapes hierarchical (e.g. hiff) anarchic (e.g. nk). notion node viscidity measure local optimum potential introduced found quite successful although work needs done improve accuracy problems larger k.",4 "similarity-based estimation word cooccurrence probabilities. many applications natural language processing necessary determine likelihood given word combination. example, speech recognizer may need determine two word combinations ``eat peach'' ``eat beach'' likely. statistical nlp methods determine likelihood word combination according frequency training corpus. however, nature language many word combinations infrequent occur given corpus. work propose method estimating probability previously unseen word combinations using available information ``most similar'' words. describe probabilistic word association model based distributional word similarity, apply improving probability estimates unseen word bigrams variant katz's back-off model. similarity-based method yields 20% perplexity improvement prediction unseen bigrams statistically significant reductions speech-recognition error.",2 "unbiased data collection content exploitation/exploration strategy personalization. 
one missions personalization systems recommender systems show content items according users' personal interests. order achieve goal, systems learning user interests time trying present content items tailoring user profiles. recommending items according users' preferences investigated extensively past years, mainly thanks popularity netflix competition. real setting, users may attracted subset items interact them, leaving partial feedbacks system learn next cycle, leads significant biases systems hence results situation user engagement metrics cannot improved time. problem one component system. data collected users usually used many different tasks, including learning ranking functions, building user profiles constructing content classifiers. data biased, downstream use cases would impacted well. therefore, would beneficial gather unbiased data user interactions. traditionally, unbiased data collection done showing items uniformly sampling content pool. however, simple scheme feasible risks user engagement metrics takes long time gather user feedbacks. paper, introduce user-friendly unbiased data collection framework, utilizing methods developed exploitation exploration literature. discuss framework different normal multi-armed bandit problems method needed. layout novel thompson sampling bernoulli ranked-list effectively balance user experiences data collection. proposed method validated real bucket test show strong results comparing old algorithms",4 "estimation tissue microstructure using deep network inspired sparse reconstruction framework. diffusion magnetic resonance imaging (dmri) provides unique tool noninvasively probing microstructure neuronal tissue. noddi model popular approach estimation tissue microstructure many neuroscience studies. represents diffusion signals three types diffusion tissue: intra-cellular, extra-cellular, cerebrospinal fluid compartments. 
however, original noddi method uses computationally expensive procedure fit model could require large number diffusion gradients accurate microstructure estimation, may impractical clinical use. therefore, efforts devoted efficient accurate noddi microstructure estimation reduced number diffusion gradients. work, propose deep network based approach noddi microstructure estimation, named microstructure estimation using deep network (medn). motivated amico algorithm accelerates computation noddi parameters, formulate microstructure estimation problem dictionary-based framework. proposed network comprises two cascaded stages. first stage resembles solution dictionary-based sparse reconstruction problem second stage computes final microstructure using output first stage. weights two stages jointly learned training data, obtained training dmri scans diffusion gradients densely sample q-space. proposed method applied brain dmri scans, two shells 30 gradient directions (60 diffusion gradients total) used. estimation accuracy respect gold standard measured results demonstrate medn outperforms competing algorithms.",4 "new perspective boosting linear regression via subgradient optimization relatives. paper analyze boosting algorithms linear regression new perspective: modern first-order methods convex optimization. show classic boosting algorithms linear regression, namely incremental forward stagewise algorithm (fs$_\varepsilon$) least squares boosting (ls-boost($\varepsilon$)), viewed subgradient descent minimize loss function defined maximum absolute correlation features residuals. also propose modification fs$_\varepsilon$ yields algorithm lasso, may easily extended algorithm computes lasso path different values regularization parameter. furthermore, show new algorithms lasso may also interpreted master algorithm (subgradient descent), applied regularized version maximum absolute correlation loss function. 
derive novel, comprehensive computational guarantees several boosting algorithms linear regression (including ls-boost($\varepsilon$) fs$_\varepsilon$) using techniques modern first-order methods convex optimization. computational guarantees inform us statistical properties boosting algorithms. particular provide, first time, precise theoretical description amount data-fidelity regularization imparted running boosting algorithm prespecified learning rate fixed arbitrary number iterations, dataset.",12 "deep reinforcement learning time series: playing idealized trading games. deep q-learning investigated end-to-end solution estimate optimal strategies acting time series input. experiments conducted two idealized trading games. 1) univariate: input wave-like price time series, 2) bivariate: input includes random stepwise price time series noisy signal time series, positively correlated future price changes. univariate game tests whether agent capture underlying dynamics, bivariate game tests whether agent utilize hidden relation among inputs. stacked gated recurrent unit (gru), long short-term memory (lstm) units, convolutional neural network (cnn), multi-layer perceptron (mlp) used model q values. games, agents successfully find profitable strategy. gru-based agents show best overall performance univariate game, mlp-based agents outperform others bivariate game.",4 "deep structured model radius-margin bound 3d human activity recognition. understanding human activity challenging even recently developed 3d/depth sensors. solve problem, work investigates novel deep structured model, adaptively decomposes activity instance temporal parts using convolutional neural networks (cnns). model advances traditional deep learning approaches two aspects. first, incorporate latent temporal structure deep model, accounting large temporal variations diverse human activities.
particular, utilize latent variables decompose input activity number temporally segmented sub-activities, accordingly feed parts (i.e. sub-networks) deep architecture. second, incorporate radius-margin bound regularization term deep model, effectively improves generalization performance classification. model training, propose principled learning algorithm iteratively (i) discovers optimal latent variables (i.e. ways activity decomposition) training instances, (ii) updates classifiers based generated features, (iii) updates parameters multi-layer neural networks. experiments, approach validated several complex scenarios human activity recognition demonstrates superior performances state-of-the-art approaches.",4 "spectral learning dynamic systems nonequilibrium data. observable operator models (ooms) related models one important powerful tools modeling analyzing stochastic systems. exactly describe dynamics finite-rank systems efficiently consistently estimated spectral learning assumption identically distributed data. paper, investigate properties spectral learning without assumption due requirements analyzing large-time scale systems, show equilibrium dynamics system extracted nonequilibrium observation data imposing equilibrium constraint. addition, propose binless extension spectral learning continuous data. comparison continuous-valued spectral algorithms, binless algorithm achieve consistent estimation equilibrium dynamics linear complexity.",4 "information content versus word length random typing. recently, claimed linear relationship measure information content word length expected word length optimization shown linearity supported strong correlation information content word length many languages (piantadosi et al. 2011, pnas 108, 3825-3826). here, study detail connections measure standard information theory.
relationship measure word length studied popular random typing process text constructed pressing keys random keyboard containing letters space behaving word delimiter. although random process optimize word lengths according information content, exhibits linear relationship information content word length. exact slope intercept presented three major variants random typing process. strong correlation information content word length simply arise units making word (e.g., letters) necessarily interplay word context proposed piantadosi et al. itself, linear relation entail results optimization process.",15 "spatial modeling oil exploration areas using neural networks anfis gis. exploration hydrocarbon resources highly complicated expensive process various geological, geochemical geophysical factors developed combined together. highly significant design seismic data acquisition survey locate exploratory wells since incorrect imprecise locations lead waste time money operation. objective study locate high-potential oil gas field 1: 250,000 sheet ahwaz including 20 oil fields reduce time costs exploration production processes. regard, 17 maps developed using gis functions factors including: minimum maximum total organic carbon (toc), yield potential hydrocarbons production (pp), tmax peak, production index (pi), oxygen index (oi), hydrogen index (hi) well presence proximity high residual bouguer gravity anomalies, proximity anticline axis faults, topography curvature maps obtained asmari formation subsurface contours. model integrate maps, study employed artificial neural network adaptive neuro-fuzzy inference system (anfis) methods. results obtained model validation demonstrated 17x10x5 neural network r=0.8948, rms=0.0267, kappa=0.9079 trained better models anfis predicts potential areas accurately. however, method failed predict oil fields wrongly predict areas potential zones.",19 "neural motifs: scene graph parsing global context. 
investigate problem producing structured graph representations visual scenes. work analyzes role motifs: regularly appearing substructures scene graphs. present new quantitative insights repeated structures visual genome dataset. analysis shows object labels highly predictive relation labels vice-versa. also find recurring patterns even larger subgraphs: 50% graphs contain motifs involving least two relations. analysis leads new baseline simple, yet strikingly powerful. hardly considering overall visual context image, outperforms previous approaches. introduce stacked motif networks, new architecture encoding global context crucial capturing higher order motifs scene graphs. best model scene graph detection achieves 7.3% absolute improvement recall@50 (41% relative gain) prior state-of-the-art.",4 "active detection localization textureless objects cluttered environments. paper introduces active object detection localization framework combines robust untextured object detection 3d pose estimation algorithm novel next-best-view selection strategy. address detection localization problems proposing edge-based registration algorithm refines object position minimizing cost directly extracted 3d image tensor encodes minimum distance edge point joint direction/location space. face next-best-view problem exploiting sequential decision process that, step, selects next camera position maximizes mutual information state next observations. solve intrinsic intractability solution generating observations represent scene realizations, i.e. combination samples object hypothesis provided object detector, modeling state means set constantly resampled particles. experiments performed different real world, challenging datasets confirm effectiveness proposed methods.",4 "possibilistic model qualitative sequential decision problems uncertainty partially observable environments. 
article propose qualitative (ordinal) counterpart partially observable markov decision processes model (pomdp) uncertainty, well preferences agent, modeled possibility distributions. qualitative counterpart pomdp model relies possibilistic theory decision uncertainty, recently developed. one advantage qualitative framework ability escape classical obstacle stochastic pomdps, even finite state space, obtained belief state space pomdp infinite. instead, possibilistic framework even exponentially larger state space, belief state space remains finite.",4 "notes electronic lexicography. notes continuation topics covered v. selegej article ""electronic dictionaries computational lexicography"". electronic dictionary object description closely related languages? obviously, question allows multiple answers.",4 "bidirectional long-short term memory video description. video captioning attracting broad research attention multimedia community. however, existing approaches either ignore temporal information among video frames employ local contextual temporal knowledge. work, propose novel video captioning framework, termed \emph{bidirectional long-short term memory} (bilstm), deeply captures bidirectional global temporal structure video. specifically, first devise joint visual modelling approach encode video data combining forward lstm pass, backward lstm pass, together visual features convolutional neural networks (cnns). then, inject derived video representation subsequent language model initialization. benefits two folds: 1) comprehensively preserving sequential visual information; 2) adaptively learning dense visual features sparse semantic representations videos sentences, respectively. verify effectiveness proposed video captioning framework commonly-used benchmark, i.e., microsoft video description (msvd) corpus, experimental results demonstrate superiority proposed approach compared several state-of-the-art methods.",4 "analysis visualisation rdf resources ondex. 
ondex data integration visualization platform developed support systems biology research. core data model based two main principles: first, information represented graph and, second, elements graph annotated ontologies. data model conformant semantic web framework, particular rdf, therefore ondex ideally positioned platform exploit semantic web.",4 "variational depth focus reconstruction. paper deals problem reconstructing depth map sequence differently focused images, also known depth focus shape focus. propose state depth focus problem variational problem including smooth nonconvex data fidelity term, convex nonsmooth regularization, makes method robust noise leads realistic depth maps. additionally, propose solve nonconvex minimization problem linearized alternating directions method multipliers (admm), allowing minimize energy efficiently. numerical comparison classical methods simulated well real data presented.",4 "using robdds inference bayesian networks troubleshooting example. using bayesian networks modelling behavior man-made machinery, usually happens large part model deterministic. bayesian networks deterministic part model represented boolean function, central part belief updating reduces task calculating number satisfying configurations boolean function. paper explore advances calculation boolean functions adopted belief updating, particular within context troubleshooting. present experimental results indicating substantial speed-up compared traditional junction tree propagation.",4 "beyond pixels regions: non local patch means (nlpm) method content-level restoration, enhancement, reconstruction degraded document images. patch-based non-local restoration reconstruction method preprocessing degraded document images introduced. method collects relative data whole input image, image data first represented content-level descriptor based patches. 
patch-equivalent representation input image corrected based similar patches identified using modified genetic algorithm (ga) resulting low computational load. corrected patch-equivalent converted output restored image. fact method uses patches content level allows incorporate high-level restoration objective self-sufficient way. method applied several degraded document images, including dibco'09 contest dataset promising results.",4 "sdna: stochastic dual newton ascent empirical risk minimization. propose new algorithm minimizing regularized empirical loss: stochastic dual newton ascent (sdna). method dual nature: iteration update random subset dual variables. however, unlike existing methods stochastic dual coordinate ascent, sdna capable utilizing curvature information contained examples, leads striking improvements theory practice - sometimes orders magnitude. special case l2-regularizer used primal, dual problem concave quadratic maximization problem plus separable term. regime, sdna step solves proximal subproblem involving random principal submatrix hessian quadratic function; whence name method. if, addition, loss functions quadratic, method interpreted novel variant recently introduced iterative hessian sketch.",4 "stochastic reformulations linear systems: algorithms convergence theory. develop family reformulations arbitrary consistent linear system stochastic problem. reformulations governed two user-defined parameters: positive definite matrix defining norm, arbitrary discrete continuous distribution random matrices. reformulation several equivalent interpretations, allowing researchers various communities leverage domain specific insights. particular, reformulation equivalently seen stochastic optimization problem, stochastic linear system, stochastic fixed point problem probabilistic intersection problem. prove sufficient, necessary sufficient conditions reformulation exact. 
further, propose analyze three stochastic algorithms solving reformulated problem---basic, parallel accelerated methods---with global linear convergence rates. rates interpreted condition numbers matrix depends system matrix reformulation parameters. gives rise new phenomenon call stochastic preconditioning, refers problem finding parameters (matrix distribution) leading sufficiently small condition number. basic method equivalently interpreted stochastic gradient descent, stochastic newton method, stochastic proximal point method, stochastic fixed point method, stochastic projection method, fixed stepsize (relaxation parameter), applied reformulations.",12 "benefits output sparsity multi-label classification. multi-label classification framework, observation associated set labels, generated tremendous amount attention recent years. modern multi-label problems typically large-scale terms number observations, features labels, amount labels even comparable amount observations. context, different remedies proposed overcome curse dimensionality. work, aim exploiting output sparsity introducing new loss, called sparse weighted hamming loss. proposed loss seen weighted version classical ones, active inactive labels weighted separately. leveraging influence sparsity loss function, provide improved generalization bounds empirical risk minimizer, suitable property large-scale problems. new loss, derive rates convergence linear underlying output-sparsity rather linear number labels. practice, minimizing associated risk performed efficiently using convex surrogates modern convex optimization algorithms. provide experiments various real-world datasets demonstrating pertinence approach compared non-weighted techniques.",12 "scale-invariance ruggedness measures fractal fitness landscapes. paper deals using chaos direct trajectories targets analyzes ruggedness fractality resulting fitness landscapes. 
targeting problem formulated dynamic fitness landscape four different chaotic maps generating landscape studied. using computational approach, analyze properties landscapes quantify fractal rugged characteristics. particular, shown ruggedness measures correlation length information content scale-invariant self-similar.",13 "npglm: non-parametric method temporal link prediction. paper, try solve problem temporal link prediction information networks. implies predicting time takes link appear future, given features extracted current network snapshot. end, introduce probabilistic non-parametric approach, called ""non-parametric generalized linear model"" (np-glm), infers hidden underlying probability distribution link advent time given features. present learning algorithm np-glm inference method answer time-related queries. extensive experiments conducted synthetic data real-world sina weibo social network demonstrate effectiveness np-glm solving temporal link prediction problem vis-a-vis competitive baselines.",4 "publishing linking transport data web. without linked data, transport data limited applications exclusively around transport. paper, present workflow publishing linking transport data web. able develop transport applications add features created datasets. possible transport data linked datasets. apply workflow two datasets: neptune, french standard describing transport line, passim, directory containing relevant information transport services, every french city.",4 "inducing interpretability knowledge graph embeddings. study problem inducing interpretability kg embeddings. specifically, explore universal schema (riedel et al., 2013) propose method induce interpretability. many vector space models proposed problem, however, methods address interpretability (semantics) individual dimensions. work, study problem propose method inducing interpretability kg embeddings using entity co-occurrence statistics. 
proposed method significantly improves interpretability, maintaining comparable performance kg tasks.",4 "stochastic metamorphosis template uncertainties. paper, investigate two stochastic perturbations metamorphosis equations image analysis, geometrical context euler-poincar\'e theory. metamorphosis images, lie group diffeomorphisms deforms template image undergoing internal dynamics deforms. type deformation allows freedom image matching analogies complex fluids template properties regarded order parameters (coset spaces broken symmetries). first stochastic perturbation consider corresponds uncertainty due random errors reconstruction deformation map vector field. also consider second stochastic perturbation, compounds uncertainty deformation map uncertainty reconstruction template position velocity field. apply general geometric theory several classical examples, including landmarks, images, closed curves, discuss use functional data analysis.",4 "homotopy parametric simplex method sparse learning. high dimensional sparse learning imposed great computational challenge large scale data analysis. paper, interested broad class sparse learning approaches formulated linear programs parametrized {\em regularization factor}, solve parametric simplex method (psm). parametric simplex method offers significant advantages competing methods: (1) psm naturally obtains complete solution path values regularization parameter; (2) psm provides high precision dual certificate stopping criterion; (3) psm yields sparse solutions iterations, solution sparsity significantly reduces computational cost per iteration. particularly, demonstrate superiority psm various sparse learning approaches, including dantzig selector sparse linear regression, lad-lasso sparse robust linear regression, clime sparse precision matrix estimation, sparse differential network estimation, sparse linear programming discriminant (lpd) analysis. 
we provide sufficient conditions under which psm always outputs sparse solutions such that its computational performance can be significantly boosted. thorough numerical experiments are provided to demonstrate the outstanding performance of the psm method.",4 "a neuro-mathematical model for geometrical optical illusions. geometrical optical illusions have been the object of many studies due to the possibility they offer to understand the behaviour of low-level visual processing. they consist in situations in which the perceived geometrical properties of an object differ from those of the object in the visual stimulus. starting from the geometrical model introduced by citti and sarti [3], we provide a mathematical model and a computational algorithm which allows us to interpret these phenomena and to qualitatively reproduce the perceived misperception.",4 "a logic-based approach to generatively defined discriminative modeling. conditional random fields (crfs) are usually specified by graphical models but in this paper we propose to use probabilistic logic programs and specify them generatively. our intention is first to provide a unified approach to crfs for complex modeling through the use of a turing complete language and second to offer a convenient way of realizing generative-discriminative pairs in machine learning, to compare generative and discriminative models and choose the best model. we implemented our approach as the d-prism language by modifying prism, a logic-based probabilistic modeling language for generative modeling, while exploiting its dynamic programming mechanism for efficient probability computation. we tested d-prism with logistic regression, a linear-chain crf and a crf-cfg and empirically confirmed their excellent discriminative performance compared to their generative counterparts, i.e.\ naive bayes, an hmm and a pcfg. we also introduced new crf models, crf-bncs and crf-lcgs. they are crf versions of bayesian network classifiers and probabilistic left-corner grammars respectively, and are easily implementable in d-prism. we empirically showed that they outperform their generative counterparts as expected.",4 "mdps with unawareness. markov decision processes (mdps) are widely used for modeling decision-making problems in robotics, automated control, and economics. traditional mdps assume that the decision maker (dm) knows all states and actions. however, this may not be true in many situations of interest. 
we define a new framework, mdps with unawareness (mdpus), to deal with the possibility that a dm may not be aware of all possible actions. we provide a complete characterization of when a dm can learn to play near-optimally in an mdpu, and give an algorithm that learns to play near-optimally when it is possible to do so, as efficiently as possible. in particular, we characterize when a near-optimal solution can be found in polynomial time.",4 "weighted unsupervised learning for 3d object detection. this paper introduces a novel weighted unsupervised learning method for object detection using an rgb-d camera. the technique is feasible for detecting moving objects in noisy environments captured by an rgb-d camera. the main contribution of this paper is a real-time algorithm for detecting each object, using weighted clustering, as a separate cluster. in a preprocessing step, the algorithm calculates the pose 3d position x, y, z and rgb color of each data point and calculates each data point's normal vector using the point's neighbors. after preprocessing, the algorithm calculates the k-weights for each data point; each weight indicates membership, resulting in the clustered objects of the scene.",4 "prediction using note text: synthetic feature creation with word2vec. word2vec affords a simple yet powerful approach to extracting quantitative variables from unstructured textual data. over half of all healthcare data is unstructured and therefore hard to model without involved expertise in data engineering and natural language processing. word2vec can serve as a bridge to quickly gather intelligence from such data sources. in this study, we ran 650 megabytes of unstructured, medical chart notes from the providence health & services electronic medical record through word2vec. we used two different approaches for creating predictive variables and tested them on the risk of readmission for patients with copd (chronic obstructive lung disease). as a comparative benchmark, we ran the same test using the lace risk model (a single score based on length of stay, acuity, comorbid conditions, and emergency department visits). using only free text and mathematical might, we found word2vec comparable to lace in predicting the risk of readmission of copd patients.",4 "polynomial value iteration algorithms for deterministic mdps. value iteration is a commonly used and empirically competitive method for solving many markov decision process problems. 
however, it is known that value iteration has only pseudo-polynomial complexity in general. we establish a somewhat surprising polynomial bound for value iteration on deterministic markov decision (dmdp) problems. we show that the basic value iteration procedure converges to the highest average reward cycle on a dmdp problem in theta(n^2) iterations, or theta(mn^2) total time, where n denotes the number of states, and m the number of edges. we give two extensions of value iteration that solve the dmdp in theta(mn) time. we explore the analysis of policy iteration algorithms and report on an empirical study of value iteration showing that its convergence is much faster on random sparse graphs.",4 "empirically analyzing the effect of dataset biases on deep face recognition systems. it is unknown what kind of biases modern in-the-wild face datasets have because of their lack of annotation. a direct consequence of this is that total recognition rates alone only provide limited insight about the generalization ability of deep convolutional neural networks (dcnns). we propose to empirically study the effect of different types of dataset biases on the generalization ability of dcnns. using synthetically generated face images, we study the face recognition rate as a function of interpretable parameters such as face pose and light. the proposed method allows valuable details about the generalization performance of different dcnn architectures to be observed and compared. in our experiments, we find that: 1) indeed, dataset bias has a significant influence on the generalization performance of dcnns. 2) dcnns can generalize surprisingly well to unseen illumination conditions and large sampling gaps in the pose variation. 3) we uncover a main limitation of current dcnn architectures, which is the difficulty to generalize when different identities do not share the same pose variation. 4) we demonstrate that our findings on synthetic data also apply when learning from real world data. our face image generator is publicly available to enable the community to benchmark face recognition systems on a common ground.",4 "an iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors. high-dimensional data pose challenges in statistical learning and modeling. sometimes the predictors can be naturally grouped, where pursuing between-group sparsity is desired. 
collinearity may occur in real-world high-dimensional applications, where the popular $l_1$ technique suffers from both selection inconsistency and prediction inaccuracy. moreover, the problems of interest often go beyond gaussian models. to meet these challenges, nonconvex penalized generalized linear models with grouped predictors are investigated and a simple-to-implement algorithm is proposed for computation. a rigorous theoretical result guarantees its convergence and provides tight preliminary scaling. this framework allows for grouped predictors and nonconvex penalties, including the discrete $l_0$ and `$l_0+l_2$' type penalties. penalty design and parameter tuning for nonconvex penalties are examined. applications to super-resolution spectrum estimation in signal processing and cancer classification with joint gene selection in bioinformatics show the performance improvement by nonconvex penalized estimation.",19 "classification with ensembles of neural networks. we introduce a new procedure for training artificial neural networks by using an approximation of the objective function by the arithmetic mean of an ensemble of selected randomly generated neural networks, and apply this procedure to a classification (or pattern recognition) problem. this approach differs from the standard one based on optimization theory. in particular, a neural network from the mentioned ensemble may not itself be an approximation of the objective function.",4 "inferring the location of authors from the words in their texts. for the purposes of computational dialectology or other geographically bound text analysis tasks, texts must be annotated with their authors' location. many texts are locatable but most have no explicit labels or explicit annotation of place. this paper describes a series of experiments to determine how positionally annotated microblog posts can be used to learn location-indicating words, which can then be used to locate blog texts and their authors. a gaussian distribution is used to model the locational qualities of words. we introduce the notion of placeness to describe how locational words are. we find that modelling word distributions to account for several locations, and thus several gaussian distributions per word, defining a filter that picks out words with high placeness based on their local distributional context, and aggregating the locational information in a centroid for each text gives useful results. 
the results are applied to data in the swedish language.",4 "neutrality and many-valued logics. in this book, we consider various many-valued logics: standard, linear, hyperbolic, parabolic, non-archimedean, p-adic, interval, neutrosophic, etc. we also survey results which show three different proof-theoretic frameworks for many-valued logics, e.g. the frameworks of the following deductive calculi: hilbert's style, sequent, and hypersequent. we present a general way that allows one to construct systematically analytic calculi for a large family of non-archimedean many-valued logics: hyperrational-valued, hyperreal-valued, and p-adic valued logics characterized by a special format of semantics with an appropriate rejection of archimedes' axiom. these logics are built as different extensions of standard many-valued logics (namely, lukasiewicz's, goedel's, product, and post's logics). the informal sense of archimedes' axiom is that anything can be measured by a ruler. logical multiple-validity without archimedes' axiom consists in that the set of truth values is infinite and is not well-founded and well-ordered. on the base of non-archimedean valued logics, we construct a non-archimedean valued interval neutrosophic logic inl by which we can describe neutrality phenomena.",4 "regression trees and random forest based feature selection for malaria risk exposure prediction. this paper deals with the prediction of the anopheles number, the main vector of malaria risk, using environmental and climate variables. the variable selection is based on an automatic machine learning method using regression trees and random forests combined with a stratified two-level cross validation. the minimum threshold of variable importance is assessed using the quadratic distance of variable importance, and the optimal subset of selected variables is used to perform predictions. finally the results revealed to be qualitatively better, from the selection, prediction and cpu time points of view, than those obtained by the glm-lasso method.",19 "mml is not consistent for neyman-scott. strict minimum message length (smml) is a statistical inference method widely cited (but only with informal arguments) as providing estimations that are consistent for general estimation problems. 
it is, however, almost invariably intractable to compute, for which reason only approximations of it (known as mml algorithms) are ever used in practice. we investigate the neyman-scott estimation problem, an oft-cited showcase for the consistency of mml, and show that even with a natural choice of prior, neither smml nor its popular approximations are consistent for it, thereby providing a counterexample to the general claim. this is the first known explicit construction of an smml solution for a natural, high-dimensional problem. we use the novel construction methods to refute other claims regarding mml also appearing in the literature.",19 "a planning based framework for essay generation. generating an article automatically with a computer program is a challenging task in artificial intelligence and natural language processing. in this paper, we target essay generation, which takes as input a topic word in mind and generates an organized article under the theme of the topic. we follow the idea of text planning \cite{reiter1997} and develop an essay generation framework. the framework consists of three components, including topic understanding, sentence extraction and sentence reordering. for each component, we studied several statistical algorithms and empirically compared them in terms of qualitative and quantitative analysis. although we run experiments on a chinese corpus, the method is language independent and can be easily adapted to another language. we lay out the remaining challenges and suggest avenues for future research.",4 "deep ordinal ranking for multi-category diagnosis of alzheimer's disease using hippocampal mri data. increasing effort in brain image analysis has been dedicated to early diagnosis of alzheimer's disease (ad) based on neuroimaging data. most existing studies have focused on binary classification problems, e.g., distinguishing ad patients from normal control (nc) elderly, or mild cognitive impairment (mci) individuals from nc elderly. however, identifying individuals with ad and mci, especially mci individuals who will convert to ad (progressive mci, pmci), in a single setting, is needed to achieve the goal of early diagnosis of ad. 
in this paper, we propose a deep ordinal ranking model for distinguishing nc, stable mci (smci), pmci, and ad at an individual subject level, taking into account the inherent ordinal severity of brain degeneration caused by normal aging, mci, and ad, rather than formulating the classification as a multi-category classification problem. the proposed deep ordinal ranking model focuses on the hippocampal morphology of individuals and learns informative and discriminative features automatically. experiment results based on a large cohort of individuals from the alzheimer's disease neuroimaging initiative (adni) indicate that the proposed method can achieve better performance than traditional multi-category classification techniques using shape and radiomics features from structural magnetic resonance imaging (mri) data.",4 "visualizing and understanding neural models in nlp. while neural networks have been successfully applied to many nlp tasks, the resulting vector-based models are very difficult to interpret. for example it is not clear how they achieve {\em compositionality}, building sentence meaning from the meanings of words and phrases. in this paper we describe four strategies for visualizing compositionality in neural models for nlp, inspired by similar work in computer vision. we first plot unit values to visualize compositionality of negation, intensification, and concessive clauses, allowing us to see well-known markedness asymmetries in negation. we then introduce three simple and straightforward methods for visualizing a unit's {\em salience}, the amount it contributes to the final composed meaning: (1) gradient back-propagation, (2) the variance of a token from the average word node, (3) lstm-style gates that measure information flow. we test our methods on sentiment using simple recurrent nets and lstms. our general-purpose methods may have wide applications for understanding compositionality and other semantic properties of deep networks, and also shed light on why lstms outperform simple recurrent nets.",4 "segmentation free object discovery in video. in this paper we present a simple yet effective approach to extend without supervision any object proposal from static images to videos. 
unlike previous methods, these spatio-temporal proposals, to which we refer as tracks, are generated relying on little or no visual content by only exploiting bounding box spatial correlations through time. the tracks we obtain are likely to represent objects and are a general-purpose tool to represent meaningful video content for a wide variety of tasks. for unannotated videos, tracks can be used to discover content without supervision. as a further contribution we also propose a novel and dataset-independent method to evaluate a generic object proposal based on the entropy of a classifier output response. we experiment on two competitive datasets, namely youtube objects and ilsvrc-2015 vid.",4 "labelfusion: a pipeline for generating ground truth labels for real rgbd data of cluttered scenes. deep neural network (dnn) architectures have been shown to outperform traditional pipelines for object segmentation and pose estimation using rgbd data, but the performance of these dnn pipelines is directly tied to how representative the training data is of the true data. hence a key requirement for employing these methods in practice is to have a large set of labeled data for the specific robotic manipulation task, a requirement that is not generally satisfied by existing datasets. in this paper we develop a pipeline to rapidly generate high quality rgbd data with pixelwise labels and object poses. we use an rgbd camera to collect video of a scene from multiple viewpoints and leverage existing reconstruction techniques to produce a 3d dense reconstruction. we label the 3d reconstruction using human assisted icp-fitting of object meshes. by reprojecting the results of labeling the 3d scene we can produce labels for each rgbd image of the scene. this pipeline enabled us to collect over 1,000,000 labeled object instances in just a few days. we use this dataset to answer questions related to how much training data is required, and of what quality the data must be, to achieve high performance from a dnn architecture.",4 automated assignment of backbone nmr data using artificial intelligence. nuclear magnetic resonance (nmr) spectroscopy is a powerful method for the investigation of three-dimensional structures of biological molecules such as proteins. determining a protein structure is essential for understanding its function and the alterations in function that lead to disease. 
one of the major challenges of the post-genomic era is to obtain structural and functional information on the many unknown proteins encoded by thousands of newly identified genes. the goal of this research is to design an algorithm capable of automating the analysis of backbone protein nmr data by implementing ai strategies such as greedy and a* search.,4 "a model of virtual carrier immigration in digital images for region segmentation. a novel model for image segmentation is proposed, inspired by the carrier immigration mechanism in the physical p-n junction. the carrier diffusing and drifting are simulated in the proposed model, which imitates the physical self-balancing mechanism of the p-n junction. the effect of virtual carrier immigration in digital images is analyzed and studied by experiments on test images and real world images. the sign distribution of the net carrier at the model's balance state is exploited for region segmentation. the experimental results on test images and real-world images demonstrate self-adaptive and meaningful gathering of pixels into suitable regions, which proves the effectiveness of the proposed method for image region segmentation.",4 "an evolutionary model of turing machines. the development of a large non-coding fraction in eukaryotic dna and the phenomenon of code-bloat in the field of evolutionary computations show a striking similarity. this seems to suggest that (in the presence of mechanisms of code growth) the evolution of a complex code cannot be attained without maintaining a large inactive fraction. to test this hypothesis we performed computer simulations of an evolutionary toy model for turing machines, studying the relations among fitness and coding/non-coding ratio while varying mutation and code growth rates. the results suggest that, in our model, a large reservoir of non-coding states constitutes a great (long term) evolutionary advantage.",16 "smash: physics-guided reconstruction of collisions from videos. collision sequences are commonly used in games and entertainment to add drama and excitement. authoring even two body collisions in the real world can be difficult, as one has to get the timing and the object trajectories correctly synchronized. after tedious trial-and-error iterations, when objects are actually made to collide, they are difficult to capture in 3d. 
in contrast, synthetically generating plausible collisions is difficult as it requires adjusting different collision parameters (e.g., object mass ratio, coefficient of restitution, etc.) and appropriate initial parameters. we present smash to directly read appropriate collision parameters from raw input video recordings. technically we enable this by utilizing the laws of rigid body collision to regularize the problem of lifting 2d trajectories to a physically valid 3d reconstruction of the collision. the reconstructed sequences can then be modified and combined to easily author novel and plausible collisions. we evaluate our system on a range of synthetic scenes and demonstrate the effectiveness of our method by accurately reconstructing several complex real world collision events.",4 "image authentication based on neural networks. neural networks have been attracting researchers since the past decades. their properties, such as parameter sensitivity, random similarity, learning ability, etc., make them suitable for information protection, such as data encryption, data authentication, intrusion detection, etc. in this paper, by investigating neural networks' properties, a low-cost authentication method based on neural networks is proposed and used to authenticate images or videos. the authentication method can detect whether images or videos are modified maliciously. firstly, this chapter introduces neural networks' properties, such as parameter sensitivity, random similarity, diffusion property, confusion property, one-way property, etc. secondly, the chapter gives an introduction to neural network based protection methods. thirdly, an image and video authentication scheme based on neural networks is presented, and its performances, including security, robustness and efficiency, are analyzed. finally, conclusions are drawn, and open issues in this field are presented.",4 "want answers? a reddit inspired study on how to pose questions. questions form an integral part of our everyday communication, both offline and online. getting responses to our questions from others is fundamental to satisfying our information need and to extending our knowledge boundaries. a question may be represented using various factors such as social, syntactic, semantic, etc. 
we hypothesize that these factors contribute with varying degrees towards getting responses from others for a given question. we perform a thorough empirical study to measure the effects of these factors using a novel question and answer dataset from the website reddit.com. to the best of our knowledge, this is the first analysis of its kind on this important topic. we also use a sparse nonnegative matrix factorization technique to automatically induce interpretable semantic factors from the question dataset. we also document various patterns in response prediction we observe during our analysis of the data. for instance, we found that preference-probing questions are scantily answered. our method is robust in capturing such latent response factors. we hope to make our code and datasets publicly available upon publication of the paper.",4 "an experimental comparison of several clustering and initialization methods. we examine methods for clustering in high dimensions. in the first part of the paper, we perform an experimental comparison between three batch clustering algorithms: the expectation-maximization (em) algorithm, a winner take all version of the em algorithm reminiscent of the k-means algorithm, and model-based hierarchical agglomerative clustering. we learn naive-bayes models with a hidden root node, using high-dimensional discrete-variable data sets (both real and synthetic). we find that the em algorithm significantly outperforms the other methods, and proceed to investigate the effect of various initialization schemes on the final solution produced by the em algorithm. the initializations we consider are (1) parameters sampled from an uninformative prior, (2) random perturbations of the marginal distribution of the data, and (3) the output of hierarchical agglomerative clustering. although the methods are substantially different, they lead to learned models that are strikingly similar in quality.",4 "smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization. this work presents a general framework for solving low rank and/or sparse matrix minimization problems, which may involve multiple non-smooth terms. the iteratively reweighted least squares (irls) method is a fast solver, which smooths the objective function and minimizes it by alternately updating the variables and their weights. however, traditional irls can only solve a sparse or low rank minimization problem with squared loss or an affine constraint. 
this work generalizes irls to solve joint/mixed low rank and sparse minimization problems, which are essential formulations for many tasks. as a concrete example, we solve the schatten-$p$ norm and $\ell_{2,q}$-norm regularized low-rank representation (lrr) problem by irls, and theoretically prove that the derived solution is a stationary point (globally optimal if $p,q\geq1$). our convergence proof of irls is more general than previous ones that depend on the special properties of the schatten-$p$ norm and $\ell_{2,q}$-norm. extensive experiments on both synthetic and real data sets demonstrate that our irls is much more efficient.",4 "detecting adversarial samples using density ratio estimates. machine learning models, especially ones based on deep architectures, are used in everyday applications ranging from self driving cars to medical diagnostics. it has been shown that such models are dangerously susceptible to adversarial samples, indistinguishable from real samples to the human eye, where adversarial samples lead to incorrect classifications with high confidence. the impact of adversarial samples is far-reaching and their efficient detection remains an open problem. we propose to use direct density ratio estimation as an efficient model agnostic measure to detect adversarial samples. our proposed method works equally well with single and multi-channel samples, and with different adversarial sample generation methods. we also propose a method to use density ratio estimates for generating adversarial samples with an added constraint of preserving the density ratio.",4 "a joint framework for argumentative text analysis incorporating domain knowledge. in argumentation mining, there are several sub-tasks such as argumentation component type classification and relation classification. existing research tends to solve such sub-tasks separately, ignoring the close relation between them. in this paper, we present a joint framework incorporating the logical relation between sub-tasks to improve the performance of argumentation structure generation. we design an objective function to combine the predictions from individual models for each sub-task and solve the problem with constraints constructed from background knowledge. we evaluate our proposed model on two public corpora and the experiment results show that our model can outperform the baseline that uses a separate model for each sub-task significantly. 
our model also shows advantages on component-related sub-tasks compared to a state-of-the-art joint model based on the evidence graph.",4 "analyzing users' sentiment towards popular consumer industries and brands on twitter. social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinions on consumer brands and products. these users wield enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. in this paper, we analyze the opinion of 19m twitter users towards 62 popular industries, encompassing 12,898 enterprise and consumer brands, as well as associated subject matter topics, via sentiment analysis of 330m tweets over a period spanning a month. we find that users tend to be most positive towards manufacturing and most negative towards service industries. in addition, they tend to be more positive or negative when interacting with brands than generally on twitter. we also find that sentiment towards brands within an industry varies greatly, and we demonstrate this using two industries as use cases. in addition, we discover a strong correlation between topic sentiments of different industries, demonstrating that topic sentiments are highly dependent on the context of the industry they are mentioned in. we demonstrate the value of such an analysis in order to assess the impact of brands on social media. we hope that this initial study will prove valuable for both researchers and companies in understanding users' perception of industries, brands and associated topics, and encourage more research in this field.",4 "an operator for entity extraction in mapreduce. dictionary-based entity extraction involves finding mentions of dictionary entities in text. text mentions are often noisy, containing spurious or missing words. efficient algorithms for detecting approximate entity mentions follow one of two general techniques. the first approach is to build an index on the entities and perform index lookups of document substrings. the second approach recognizes that the number of substrings generated from documents can explode to large numbers; to get around this, it uses a filter to prune the many substrings which cannot match any dictionary entity and verifies only the remaining substrings to be entity mentions of dictionary entities, by means of a text join. 
the choice between the index-based approach and the filter & verification-based approach is a case-to-case decision as the best approach depends on the characteristics of the input entity dictionary, for example the frequency of entity mentions. choosing the right approach for the setting can make a substantial difference in execution time. making this choice is however non-trivial as there are parameters within each of the approaches that make the space of possible approaches very large. in this paper, we present a cost-based operator for making the choice among execution plans for entity extraction. since we need to deal with large dictionaries and even larger datasets, our operator is developed for implementations of mapreduce distributed algorithms.",4 "fixed-point and coordinate descent algorithms for regularized kernel methods. in this paper, we study two general classes of optimization algorithms for kernel methods with convex loss function and quadratic norm regularization, and analyze their convergence. the first approach, based on fixed-point iterations, is simple to implement and analyze, and can be easily parallelized. the second, based on coordinate descent, exploits the structure of additively separable loss functions to compute solutions of line searches in closed form. instances of these general classes of algorithms are already incorporated into state of the art machine learning software for large scale problems. we start from a solution characterization of the regularized problem, obtained using sub-differential calculus and resolvents of monotone operators, which holds for general convex loss functions regardless of differentiability. the two methodologies described in the paper can be regarded as instances of non-linear jacobi and gauss-seidel algorithms, and are both well-suited to solve large scale problems.",4 "neutrosophic entropy and its five components. this paper presents two variants of penta-valued representation for neutrosophic entropy. the first is an extension of kaufmann's formula and the second is an extension of kosko's formula. based on the primary three-valued information represented by the degree of truth, degree of falsity and degree of neutrality, penta-valued representations are built that better highlight some specific features of neutrosophic entropy. thus, we highlight five features of neutrosophic uncertainty such as ambiguity, ignorance, contradiction, neutrality and saturation. 
these five features are supplemented to a seven-fold partition of unity by adding two features of neutrosophic certainty, truth and falsity. the paper also presents the particular forms of neutrosophic entropy obtained in the case of bifuzzy representations, intuitionistic fuzzy representations, paraconsistent fuzzy representations and finally in the case of fuzzy representations.",4 "online control of the false discovery rate with decaying memory. in the online multiple testing problem, p-values corresponding to different null hypotheses are observed one by one, and the decision of whether or not to reject the current hypothesis must be made immediately, before the next p-value is observed. alpha-investing algorithms to control the false discovery rate (fdr), formulated by foster and stine, have been generalized and applied to many settings, including quality-preserving databases in science and multiple a/b or multi-armed bandit tests for internet commerce. this paper improves the class of generalized alpha-investing algorithms (gai) in four ways: (a) we show how to uniformly improve the power of the entire class of monotone gai procedures by awarding more alpha-wealth for each rejection, giving a win-win resolution to a recent dilemma raised by javanmard and montanari, (b) we demonstrate how to incorporate prior weights to indicate domain knowledge of which hypotheses are likely to be non-null, (c) we allow differing penalties for false discoveries to indicate that some hypotheses may be more important than others, (d) we define a new quantity called the decaying memory false discovery rate (mem-fdr) that may be more meaningful for truly temporal applications, and which alleviates problems that we describe and refer to as ""piggybacking"" and ""alpha-death"". our gai++ algorithms incorporate all four generalizations simultaneously, and reduce to more powerful variants of earlier algorithms when the weights and decay are set to unity. finally, we also describe a simple method to derive new online fdr rules based on an estimated false discovery proportion.",19 """liar, liar pants on fire"": a new benchmark dataset for fake news detection. automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. however, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. 
in this paper, we present liar: a new, publicly available dataset for fake news detection. we collected a decade-long, 12.8k manually labeled short statements in various contexts from politifact.com, which provides detailed analysis reports and links to source documents for each case. this dataset can be used for fact-checking research as well. notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. we designed a novel, hybrid convolutional neural network to integrate meta-data with text. we show that this hybrid approach can improve a text-only deep learning model.",4 "dotmark - a benchmark for discrete optimal transport. the wasserstein metric or earth mover's distance (emd) is a useful tool in statistics, machine learning and computer science with many applications to biological or medical imaging, among others. especially in the light of increasingly complex data, the computation of these distances via optimal transport is often the limiting factor. inspired by this challenge, a variety of new approaches to optimal transport has been proposed in recent years, and along with these new methods comes the need for a meaningful comparison. in this paper, we introduce a benchmark for discrete optimal transport, called dotmark, which is designed to serve as a neutral collection of problems, where discrete optimal transport methods can be tested, compared to one another, and brought to their limits on large-scale instances. it consists of a variety of grayscale images, in various resolutions and classes, such as several types of randomly generated images, classical test images and real data from microscopy. along with the dotmark we present a survey and a performance test for a cross section of established methods ranging from more traditional algorithms, such as the transportation simplex, to recently developed approaches, such as the shielding neighborhood method, and including also a comparison with commercial solvers.",12 "generalised reichenbachian common cause systems. the principle of the common cause claims that if an improbable coincidence has occurred, there must exist a common cause. this is generally taken to mean that positive correlations between non-causally related events should disappear when conditioning on the action of some underlying common cause. 
the extended interpretation of the principle, in contrast, urges that common causes should be called for in order to explain positive deviations between the estimated correlation of two events and the expected value of their correlation. the aim of this paper is to provide the extended reading of the principle with a general probabilistic model, capturing the simultaneous action of a system of multiple common causes. to this end, two distinct models are elaborated, and the necessary and sufficient conditions for their existence are determined.",19 "an ontological architecture for orbital debris data. the orbital debris problem presents an opportunity for inter-agency and international cooperation toward the mutually beneficial goals of debris prevention, mitigation, remediation, and improved space situational awareness (ssa). achieving these goals requires sharing orbital debris and other ssa data. toward this, we present an ontological architecture for the orbital debris domain, taking steps in the creation of an orbital debris ontology (odo). the purpose of this ontological system is to (i) represent general orbital debris and ssa domain knowledge, (ii) structure, and standardize where needed, orbital data and terminology, and (iii) foster semantic interoperability and data-sharing. in doing so we hope to (iv) contribute to solving the orbital debris problem, improving peaceful global ssa, and ensuring safe space travel for future generations.",4 "distributed weighted parameter averaging for svm training on big data. two popular approaches for distributed training of svms on big data are parameter averaging and admm. parameter averaging is efficient but suffers from loss of accuracy with an increase in number of partitions, while admm in the feature space is accurate but suffers from slow convergence. in this paper, we report a hybrid approach called weighted parameter averaging (wpa), which optimizes the regularized hinge loss with respect to weights on parameters. the problem is shown to be the same as solving an svm in a projected space. we also demonstrate an $o(\frac{1}{n})$ stability bound on the final hypothesis given by wpa, using novel proof techniques. experimental results on a variety of toy and real world datasets show that our approach is significantly more accurate than parameter averaging for a high number of partitions. 
It is also seen that the proposed method enjoys much faster convergence compared to ADMM in the feature space.",4 "algorithms, initializations, and convergence for the nonnegative matrix factorization. It is well known that good initializations can improve the speed and accuracy of the solutions of many nonnegative matrix factorization (NMF) algorithms. Many NMF algorithms are sensitive with respect to the initialization of W or H or both. This is especially true of algorithms of the alternating least squares (ALS) type, including the two new ALS algorithms that we present in this paper. We compare the results of six initialization procedures (two standard and four new) on our ALS algorithms. Lastly, we discuss the practical issue of choosing an appropriate convergence criterion.",4 "automatic detection of diabetes diagnosis using feature weighted support vector machines based on mutual information and modified cuckoo search. Diabetes is a major health problem in both developing and developed countries and its incidence is rising dramatically. In this study, we investigate a novel automatic approach to diagnose diabetes disease based on feature weighted support vector machines (FW-SVMs) and modified cuckoo search (MCS). The proposed model consists of three stages: firstly, PCA is applied to select an optimal subset of features out of the set of all features. Secondly, mutual information is employed to construct the FW-SVM by weighting different features based on their degree of importance. Finally, since parameter selection plays a vital role in the classification accuracy of SVMs, MCS is applied to select the best parameter values. The proposed MI-MCS-FWSVM method obtains 93.58% accuracy on the UCI dataset. Experimental results demonstrate that our method outperforms the previous methods by not only giving more accurate results but also significantly speeding up the classification procedure.",4 "learning with a Wasserstein loss. Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning, based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that is efficiently computed. 
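The efficiently computed regularized approximation referenced in the Wasserstein-loss abstract is typically the entropic regularization solved by Sinkhorn matrix scaling. A minimal NumPy sketch under that assumption (names and the toy cost matrix are illustrative):

```python
import numpy as np

def sinkhorn(p, q, C, reg=0.1, iters=200):
    """Entropy-regularized optimal transport cost between histograms
    p and q with ground cost matrix C, via Sinkhorn scaling."""
    K = np.exp(-C / reg)              # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(iters):
        v = q / (K.T @ u)             # scale columns to match q
        u = p / (K @ v)               # scale rows to match p
    T = u[:, None] * K * v[None, :]   # approximate transport plan
    return float((T * C).sum())

p = np.array([0.5, 0.5]); q = np.array([0.5, 0.5])
C = np.array([[0.0, 1.0], [1.0, 0.0]])
print(round(sinkhorn(p, q, C), 3))  # close to 0: mass can stay in place
```

Because every step is differentiable, this approximation is what makes a Wasserstein training loss practical.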
We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric.",4 "robust global localization using clustered particle filtering. Global mobile robot localization is the problem of determining a robot's pose in an environment, using sensor data, when the starting position is unknown. A family of probabilistic algorithms known as Monte Carlo localization (MCL) is currently among the most popular methods for solving this problem. MCL algorithms represent a robot's belief by a set of weighted samples, which approximate the posterior probability of where the robot is located by using a Bayesian formulation of the localization problem. This article presents an extension to the MCL algorithm, which addresses its problems when localizing in highly symmetrical environments; a situation where MCL is often unable to correctly track equally probable poses for the robot. The problem arises from the fact that sample sets in MCL often become impoverished, when samples are generated according to their posterior likelihood. Our approach incorporates the idea of clusters of samples and modifies the proposal distribution considering the probability mass of those clusters. Experimental results are presented that show that this new extension to the MCL algorithm successfully localizes in symmetric environments where ordinary MCL often fails.",4 "machine learning for bioclimatic modelling. Many machine learning (ML) approaches are widely used to generate bioclimatic models for the prediction of the geographic range of an organism as a function of climate. Applications such as prediction of range shift in an organism or range of an invasive species influenced by climate change are important parameters in understanding the impact of climate change. However, the success of machine learning-based approaches depends on a number of factors. 
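The predict-weight-resample cycle at the heart of the MCL algorithm described above can be sketched in a few lines on a 1-D toy pose space (the motion and sensor models here are simple Gaussian stand-ins, not the paper's robot models):

```python
import numpy as np

rng = np.random.default_rng(0)

def mcl_step(particles, motion, measurement, noise=0.5):
    """One Monte Carlo localization update on 1-D poses: predict with
    a noisy motion model, weight by measurement likelihood, then
    resample particles in proportion to their weights."""
    # Predict: apply the motion command plus process noise.
    particles = particles + motion + rng.normal(0, 0.1, size=particles.shape)
    # Weight: Gaussian likelihood of the position measurement.
    w = np.exp(-0.5 * ((particles - measurement) / noise) ** 2)
    w /= w.sum()
    # Resample: posterior-proportional draw (the step that can
    # impoverish the sample set in symmetric environments).
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

true_pose = 2.0
particles = rng.uniform(0, 10, size=2000)   # unknown start: global spread
for _ in range(5):
    true_pose += 1.0
    particles = mcl_step(particles, motion=1.0, measurement=true_pose)
# The particle cloud concentrates around the true pose (7.0).
```

The clustered extension in the abstract modifies exactly the resampling step, keeping probability mass on distinct clusters instead of letting one mode absorb all samples.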
While it cannot be safely said that any particular ML technique is effective in all applications, and the success of a technique is predominantly dependent on the application and the type of problem, it is useful to understand their behavior to ensure an informed choice of techniques. This paper presents a comprehensive review of machine learning-based bioclimatic model generation and analyses the factors influencing the success of such models. Considering the wide use of statistical techniques, the discussion also includes conventional statistical techniques used in bioclimatic modelling.",4 "constructing a non-negative low rank and sparse graph with data-adaptive features. This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting. Firstly, we propose to build a non-negative low-rank and sparse (referred to as NNLRS) graph for the given data representation. Specifically, the weights of edges in the graph are obtained by seeking a nonnegative low-rank and sparse matrix that represents each data sample as a linear combination of others. The so-obtained NNLRS-graph can capture both the global mixture of subspaces structure (by the low rankness) and the locally linear structure (by the sparseness) of the data, hence it is both generative and discriminative. Secondly, as good features are extremely important for constructing a good graph, we propose to learn the data embedding matrix and construct the graph jointly within one framework, which is termed NNLRS with embedded features (referred to as NNLRS-EF). Extensive experiments on three publicly available datasets demonstrate that the proposed method outperforms the state-of-the-art graph construction method by a large margin in semi-supervised classification and discriminative analysis, which verifies the effectiveness of the proposed method.",4 "binary matrix completion using unobserved entries. A matrix completion problem, which aims to recover a complete matrix from its partial observations, is one of the important problems in the machine learning field and has been studied actively. However, there is a discrepancy between the mainstream problem setting, which assumes continuous-valued observations, and some practical applications such as recommendation systems and SNS link predictions where observations take discrete or even binary values. To cope with this problem, Davenport et al. 
(2014) proposed the binary matrix completion (BMC) problem, where observations are quantized into binary values. Hsieh et al. (2015) proposed the PU (positive and unlabeled) matrix completion problem, an extension of the BMC problem. This problem targets the setting where we cannot observe negative values, as in SNS link predictions. In the construction of their method for this setting, they introduced a methodology of the classification problem, regarding each matrix entry as a sample. Their risk, which defines losses over unobserved entries as well, indicates the possibility of the use of unobserved entries. In this paper, motivated by a semi-supervised classification method recently proposed by Sakai et al. (2017), we develop a method for the BMC problem which can use all of the positive, negative, and unobserved entries, by combining the risks of Davenport et al. (2014) and Hsieh et al. (2015). To the best of our knowledge, this is the first BMC method which exploits all kinds of matrix entries. We experimentally show that an appropriate mixture of the risks improves the performance.",19 "hierarchical spatial transformer network. Computer vision researchers have been expecting neural networks to have a spatial transformation ability to eliminate the interference caused by geometric distortion for a long time. The emergence of the spatial transformer network makes this dream come true. The spatial transformer network and its variants can handle global displacement well, but lack the ability to deal with local spatial variance. Hence how to achieve a better manner of deformation in a neural network has become a pressing matter of the moment. To address this issue, we analyze the advantages and disadvantages of approximation theory and optical flow theory, and combine them to propose a novel way to achieve image deformation, implemented with a hierarchical convolutional neural network. This new approach solves for a linear deformation along with an optical flow field to model image deformation. In experiments on cluttered MNIST handwritten digits classification and image plane alignment, our method outperforms baseline methods by a large margin.",4 "towards a continuous knowledge learning engine for chatbots. Although chatbots have been very popular in recent years, they still have some serious weaknesses which limit the scope of their applications. One major weakness is that they cannot learn new knowledge during the conversation process, i.e., their knowledge is fixed beforehand and cannot be expanded or updated during conversation. 
In this paper, we propose to build a general knowledge learning engine for chatbots to enable them to continuously and interactively learn new knowledge during conversations. As time goes by, they become more and more knowledgeable and better and better at learning and conversation. We model the task as an open-world knowledge base completion problem and propose a novel technique called lifelong interactive learning and inference (LiLi) to solve it. LiLi works by imitating how humans acquire knowledge and perform inference during an interactive conversation. Our experimental results show LiLi is highly promising.",4 "solving the ""false positives"" problem in fraud prediction. In this paper, we present an automated feature engineering based approach to dramatically reduce false positives in fraud prediction. False positives plague the fraud prediction industry. It is estimated that only 1 in 5 transactions declared as fraud are actually fraud, and roughly 1 in every 6 customers have had a valid transaction declined in the past year. To address this problem, we use the deep feature synthesis algorithm to automatically derive behavioral features based on the historical data of the card associated with a transaction. We generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. We tested our machine learning model on data from a large multinational bank and compared it to their existing solution. On an unseen data of 1.852 million transactions, we were able to reduce the false positives by 54% and provide a savings of 190K euros. We also assess how to deploy this solution, and whether it necessitates streaming computation for real time scoring. We found that our solution can maintain similar benefits even when historical features are computed once every 7 days.",4 "top-k query answering in Datalog+/- ontologies under subjective reports (technical report). The use of preferences in query answering, both in traditional databases and in ontology-based data access, has recently received much attention, due to its many real-world applications. In this paper, we tackle the problem of top-k query answering in Datalog+/- ontologies subject to the querying user's preferences and a collection of (subjective) reports of other users. Here, each report consists of scores for a list of features, the author's preferences among the features, as well as other information. 
These pieces of information of every report are then combined, along with the querying user's preferences and his/her trust in each report, to rank the query results. We present two alternative rankings, along with algorithms for top-k (atomic) query answering under these rankings. We also show that, under suitable assumptions, these algorithms run in polynomial time in the data complexity. We finally present more general reports, which are associated with sets of atoms rather than single atoms.",4 "an automatic method of finding topic boundaries. This article outlines a new method of locating discourse boundaries based on lexical cohesion and a graphical technique called dotplotting. The application of dotplotting to discourse segmentation can be performed either manually, by examining a graph, or automatically, using an optimization algorithm. The results of two experiments involving automatically locating boundaries within a series of concatenated documents are presented. Areas of application and future directions for this work are also outlined.",2 "a survey of stealth malware: attacks, mitigation measures, and steps toward autonomous open world solutions. As our professional, social, and financial existences become increasingly digitized and as our government, healthcare, and military infrastructures rely more on computer technologies, they present larger and more lucrative targets for malware. Stealth malware in particular poses an increased threat because it is specifically designed to evade detection mechanisms, spreading dormant, in the wild for extended periods of time, gathering sensitive information or positioning itself for a high-impact zero-day attack. Policing the growing attack surface requires the development of efficient anti-malware solutions with improved generalization to detect novel types of malware and resolve these occurrences with as little burden on human experts as possible. In this paper, we survey malicious stealth technologies as well as existing solutions for detecting and categorizing these countermeasures autonomously. While machine learning offers promising potential for increasingly autonomous solutions with improved generalization to new malware types, both at the network level and at the host level, our findings suggest that several flawed assumptions inherent to most recognition algorithms prevent a direct mapping between the stealth malware recognition problem and a machine learning solution. 
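The lexical-cohesion idea behind the dotplotting abstract above can be illustrated with a toy valley-finder: score each candidate boundary by word overlap between adjacent sentences and cut where cohesion is lowest. This is only a sketch of the intuition, not the paper's optimization algorithm:

```python
def dotplot_boundary(sentences):
    """Toy lexical-cohesion segmentation: cut where word overlap
    (cohesion) between adjacent sentences is lowest."""
    bags = [set(s.lower().split()) for s in sentences]
    overlap = [len(bags[i] & bags[i + 1]) for i in range(len(bags) - 1)]
    return overlap.index(min(overlap)) + 1  # index of first sentence after the cut

docs = ["the cat sat on the mat",
        "the cat chased the mouse",
        "stock prices fell sharply today",
        "the market closed lower on the news"]
print(dotplot_boundary(docs))  # → 2: the topic shifts before sentence 3
```

The full dotplot generalizes this pairwise overlap to a similarity matrix over all sentence pairs, where topic segments appear as dense blocks on the diagonal.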
The most notable of these flawed assumptions is the closed world assumption: that no sample belonging to a class outside of a static training set will appear at query time. We present a formalized adaptive open world framework for stealth malware recognition and relate it mathematically to research from other machine learning domains.",4 "restricted manipulation in iterative voting: convergence and Condorcet efficiency. In collective decision making, where a voting rule is used to take a collective decision among a group of agents, manipulation by one or more agents is usually considered negative behavior to be avoided, or at least to be made computationally difficult for the agents to perform. However, there are scenarios in which a restricted form of manipulation can instead be beneficial. In this paper we consider the iterative version of several voting rules, where at each step one agent is allowed to manipulate by modifying his ballot according to a set of restricted manipulation moves which are computationally easy and require little information to be performed. We prove convergence of iterative voting rules when restricted manipulation is allowed, and we present experiments showing that most iterative voting rules have a higher Condorcet efficiency than their non-iterative version.",4 "improved sparse low-rank matrix estimation. We address the problem of estimating a sparse low-rank matrix from its noisy observation. We propose an objective function consisting of a data-fidelity term and two parameterized non-convex penalty functions. Further, we show how to set the parameters of the non-convex penalty functions, in order to ensure that the objective function is strictly convex. The proposed objective function better estimates sparse low-rank matrices than a convex method which utilizes the sum of the nuclear norm and the $\ell_1$ norm. We derive an algorithm (as an instance of ADMM) to solve the proposed problem, and guarantee its convergence provided the scalar augmented Lagrangian parameter is set appropriately. We demonstrate the proposed method for denoising an audio signal and an adjacency matrix representing protein interactions in the 'Escherichia coli' bacteria.",12 "thickness mapping of eleven retinal layers in normal eyes using spectral domain optical coherence tomography. Purpose. 
This study was conducted to determine the thickness map of eleven retinal layers in normal subjects by spectral domain optical coherence tomography (SD-OCT) and evaluate their association with sex and age. Methods. The mean regional retinal thickness of 11 retinal layers was obtained by an automatic three-dimensional diffusion-map-based method in 112 normal eyes of 76 Iranian subjects. Results. The thickness map of the central foveal area in layers 1, 3, and 4 displayed the minimum thickness (p<0.005 for all). Maximum thickness was observed nasal to the fovea in layer 1 (p<0.001), in a circular pattern in the parafoveal retinal area in layers 2, 3 and 4, and in the central foveal area in layer 6 (p<0.001). In the temporal and inferior quadrants, the total retinal thickness and the thickness in those quadrants of layer 1 were significantly greater in men than in women. In the surrounding eight sectors, the total retinal thickness and the thickness of a limited number of sectors in layers 1 and 4 were significantly correlated with age. Conclusion. SD-OCT demonstrated the three-dimensional thickness distribution of retinal layers in normal eyes. The thickness of the layers varied with sex and age in different sectors. These variables should be considered while evaluating macular thickness.",4 "learning an executable neural semantic parser. This paper describes a neural semantic parser that maps natural language utterances onto logical forms which can be executed against a task-specific environment, such as a knowledge base or a database, to produce a response. The parser generates tree-structured logical forms with a transition-based approach which combines a generic tree-generation algorithm with domain-general operations defined by the logical language. The generation process is modeled by structured recurrent neural networks, which provide a rich encoding of the sentential context and generation history for making predictions. To tackle mismatches between natural language and logical form tokens, various attention mechanisms are explored. Finally, we consider different training settings for the neural semantic parser, including fully supervised training where annotated logical forms are given, weakly-supervised training where denotations are provided, and distant supervision where only unlabeled sentences and a knowledge base are available. Experiments across a wide range of datasets demonstrate the effectiveness of our parser.",4 "towards accurate multi-person pose estimation in the wild. 
We propose a method for multi-person detection and 2-D pose estimation that achieves state-of-art results on the challenging COCO keypoints task. It is a simple, yet powerful, top-down approach consisting of two stages. In the first stage, we predict the location and scale of boxes which are likely to contain people; for this we use the Faster RCNN detector. In the second stage, we estimate the keypoints of the person potentially contained in each proposed bounding box. For each keypoint type we predict dense heatmaps and offsets using a fully convolutional ResNet. To combine these outputs we introduce a novel aggregation procedure to obtain highly localized keypoint predictions. We also use a novel form of keypoint-based non-maximum-suppression (NMS), instead of the cruder box-level NMS, and a novel form of keypoint-based confidence score estimation, instead of box-level scoring. Trained on COCO data alone, our final system achieves an average precision of 0.649 on the COCO test-dev set and 0.643 on the test-standard set, outperforming the winner of the 2016 COCO keypoints challenge and other recent state-of-art. Further, by using additional in-house labeled data we obtain an even higher average precision of 0.685 on the test-dev set and 0.673 on the test-standard set, more than 5% absolute improvement compared to the previous best performing method on the same dataset.",4 "infinity and computable probability. We show, contrary to the classical supposition, that a process generating symbols according to a probability distribution need not, in all likelihood, produce a given finite text in finite time, even if it is guaranteed to produce that text in infinite time. This result extends to target-free text generation and has implications for simulations of probabilistic processes.",12 "the expressive power of word embeddings. We seek to better understand the difference in quality of the several publicly released word embeddings. We propose several tasks that help to distinguish the characteristics of different embeddings. Our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. Moreover, benchmarking the embeddings shows great variance in quality and characteristics of the semantics captured by the tested embeddings. 
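Evaluations like the synonym/antonym task in the embeddings abstract above typically reduce to cosine similarity in the embedding space. A minimal sketch with made-up 3-d vectors (these are illustrative toy values, not trained embeddings):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity, the standard comparison in embedding space."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 3-d "embeddings" invented for illustration only.
emb = {
    "good":  np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "bad":   np.array([-0.9, 0.1, 0.0]),
}
# A synonym pair should score higher than an antonym pair.
print(cosine(emb["good"], emb["great"]) > cosine(emb["good"], emb["bad"]))  # → True
```

With real embeddings the interesting finding is exactly when this inequality fails: antonyms often share contexts and so end up close in the space.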
Finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective and useful features captured by the embedding space. Our contributions highlight the importance of embeddings for NLP tasks and the effect of their quality on the final results.",4 "multi-level coding efficiency with improved quality for image compression based on AMBTC. In this paper, an extended version of absolute moment block truncation coding (AMBTC) is proposed to compress images. Generally, the elements of a bitplane used in the variants of block truncation coding (BTC) are of size 1 bit. This is extended to two bits in the proposed method. The number of statistical moments preserved to reconstruct the compressed image is also raised from 2 to 4. Hence, the quality of the reconstructed images is improved significantly from 33.62 to 38.12, with an increase in bpp by 1. The increased bpp (3) is reduced to 1.75 in multiple levels: in level one, by dropping 4 elements of the bitplane in such a way that the pixel values of the dropped elements can easily be interpolated without much loss in quality; in level two, eight elements are dropped and reconstructed later; and in level three, the size of the statistical moments is reduced. The experiments were carried out on standard images of varying intensities. In all the cases, the proposed method outperforms the existing AMBTC technique in terms of PSNR and bpp.",4 "integrating atlas and graph cut methods for LV segmentation from cardiac cine MRI. Magnetic resonance imaging (MRI) has evolved into a clinical standard-of-care imaging modality for cardiac morphology, function assessment, and guidance of cardiac interventions. All these applications rely on accurate extraction of the myocardial tissue and blood pool from the imaging data. We propose a framework for left ventricle (LV) segmentation from cardiac cine-MRI. First, we segment the LV blood pool using iterative graph cuts, and subsequently use this information to segment the myocardium. We formulate the segmentation procedure as an energy minimization problem in a graph subject to a shape prior obtained by label propagation from an average atlas using affine registration. The proposed framework has been validated on 30 patient cardiac cine-MRI datasets available through the STACOM LV segmentation challenge and yielded fast, robust, and accurate segmentation results.",4 "Hölder projective divergences. 
We describe a framework to build distances by measuring the tightness of inequalities, and introduce the notion of proper statistical divergences and improper pseudo-divergences. We then consider the Hölder ordinary and reverse inequalities, and present two novel classes of Hölder divergences and pseudo-divergences that both encapsulate the special case of the Cauchy-Schwarz divergence. We report closed-form formulas for those statistical dissimilarities when considering distributions belonging to the same exponential family provided the natural parameter space is a cone (e.g., multivariate Gaussians), or affine (e.g., categorical distributions). Those new classes of Hölder distances are invariant to rescaling, and thus do not require distributions to be normalized. Finally, we show how to compute statistical Hölder centroids with respect to those divergences, and carry out center-based clustering toy experiments on a set of Gaussian distributions which demonstrate empirically that symmetrized Hölder divergences outperform the symmetric Cauchy-Schwarz divergence.",4 "accumulated gradient normalization. This work addresses the instability in asynchronous data parallel optimization. It does so by introducing a novel distributed optimizer which is able to efficiently optimize a centralized model under communication constraints. The optimizer achieves this by pushing a normalized sequence of first-order gradients to a parameter server. This implies that the magnitude of a worker delta is smaller compared to an accumulated gradient, and provides a better direction towards a minimum compared to first-order gradients, which in turn also forces possible implicit momentum fluctuations to be more aligned, since we make the assumption that all workers contribute towards a single minimum. As a result, our approach mitigates the parameter staleness problem more effectively, since staleness in asynchrony induces (implicit) momentum, and achieves a better convergence rate compared to other optimizers such as asynchronous EASGD and DynSGD, as we show empirically.",19 "implementing a Bayesian scheme for revising belief commitments. Our previous work on classifying complex ship images [1,2] has evolved into an effort to develop software tools for building and solving generic classification problems. Managing the uncertainty associated with feature data and other evidence is an important issue in this endeavor. 
Bayesian techniques for managing uncertainty [7,12,13] have proven useful in managing several of the belief maintenance requirements of classification problem solving. One such requirement is the need to give qualitative explanations of what is believed. Pearl [11] addresses this need by computing what he calls a belief commitment - the most probable instantiation of the hypothesis variables given the evidence available. Once belief commitments are computed, a straightforward implementation of Pearl's procedure involves finding an analytical solution to often difficult optimization problems. We describe an efficient implementation of his procedure that, using tensor products, solves these problems enumeratively and avoids the need for case by case analysis. The procedure is thereby made practical for use in the general case.",4 "spike timing precision and neural error correction: local behavior. The effects of spike timing precision and dynamical behavior on error correction in spiking neurons were investigated. Stationary discharges - phase locked, quasiperiodic, or chaotic - were induced in a simulated neuron by presenting pacemaker presynaptic spike trains across a model of a prototypical inhibitory synapse. Reduced timing precision was modeled by jittering presynaptic spike times. The aftereffects of errors - in this communication, missed presynaptic spikes - were determined by comparing postsynaptic spike times between simulations identical except for the presence or absence of errors. The results show that the effects of an error vary greatly depending on the ongoing dynamical behavior. In the case of phase lockings, a high degree of presynaptic spike timing precision can provide significantly faster error recovery. For non-locked behaviors, isolated missed spikes have little discernible aftereffect (or even serve paradoxically to reduce uncertainty in postsynaptic spike timing), regardless of presynaptic imprecision. This suggests two possible categories of error correction: high-precision locking with rapid recovery, and low-precision non-locked with error immunity.",16 "VOI-aware MCTS. UCT, a state-of-the art algorithm for Monte Carlo tree search (MCTS) in games and Markov decision processes, is based on UCB1, a sampling policy for the multi-armed bandit problem (MAB) that minimizes the cumulative regret. 
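The UCB1 policy that UCT builds on has a one-line arm-selection rule: pick the arm maximizing the empirical mean plus an exploration bonus that shrinks with the arm's pull count. A minimal self-contained sketch on two hidden Bernoulli arms (the toy arm probabilities are invented for illustration):

```python
import math
import random

def ucb1_pull(counts, means, t):
    """UCB1 arm choice: empirical mean plus an exploration bonus
    sqrt(2 ln t / n_a) that shrinks as an arm is sampled more."""
    return max(range(len(counts)),
               key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))

random.seed(0)
probs = [0.2, 0.8]                              # hidden Bernoulli arms
counts = [1, 1]
means = [float(random.random() < p) for p in probs]  # one initial pull per arm
for t in range(2, 2000):
    a = ucb1_pull(counts, means, t)
    r = float(random.random() < probs[a])
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]      # incremental mean update
# UCB1 concentrates pulls on the better arm.
print(counts[1] > counts[0])  # → True
```

The abstract's point is that this rule minimizes cumulative regret over all pulls, whereas in MCTS only the final move matters, which motivates the VOI-based policy instead.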
However, search differs from MAB in that in MCTS it is usually only the final ""arm pull"" (the actual move selection) that collects a reward, rather than all ""arm pulls"". In this paper, an MCTS sampling policy based on value of information (VOI) estimates of rollouts is suggested. An empirical evaluation of this policy and its comparison to UCB1 and UCT is performed on random MAB instances as well as on computer Go.",4 "estimating the success of unsupervised image to image translation. While in supervised learning the validation error is an unbiased estimator of the generalization (test) error and complexity-based generalization bounds are abundant, no such bounds exist for learning a mapping in an unsupervised way. As a result, when training GANs, and specifically when using GANs for learning to map between domains in a completely unsupervised way, one is forced to select the hyperparameters and the stopping epoch by subjectively examining multiple options. We propose a novel bound for predicting the success of unsupervised cross domain mapping methods, which is motivated by the recently proposed simplicity principle. The bound can be applied both in expectation, for comparing hyperparameters and for selecting a stopping criterion, or per sample, in order to predict the success of a specific cross-domain translation. The utility of the bound is demonstrated in an extensive set of experiments employing multiple recent algorithms. Our code is available at https://github.com/sagiebenaim/gan_bound .",4 "a simple language model based on PMI matrix approximations. In this study, we introduce a new approach for learning language models by training them to estimate word-context pointwise mutual information (PMI), and then deriving the desired conditional probabilities from PMI at test time. Specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established noise contrastive estimation (NCE) based language models. A compelling aspect of our approach is that our models are trained with the same simple negative sampling objective function that is commonly used in word2vec to learn word embeddings.",4 "an optimal Bayesian network based solution scheme for the constrained stochastic on-line equi-partitioning problem. A number of intriguing decision scenarios revolve around partitioning a collection of objects to optimize some application specific objective function. 
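The word-context PMI that the language-model abstract above trains toward has a simple count-based definition: PMI(w, c) = log P(w, c) / (P(w) P(c)). A toy corpus-counting sketch (illustrative only; the paper estimates PMI with a word2vec-style objective rather than explicit counts):

```python
import math
from collections import Counter

def pmi_matrix(corpus, window=1):
    """Count word-context co-occurrences in a symmetric window and
    convert them to pointwise mutual information."""
    pairs, words = Counter(), Counter()
    for sent in corpus:
        toks = sent.split()
        for i, w in enumerate(toks):
            words[w] += 1
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if j != i:
                    pairs[(w, toks[j])] += 1
    total_pairs = sum(pairs.values())
    total_words = sum(words.values())
    return {
        (w, c): math.log((n / total_pairs) /
                         ((words[w] / total_words) * (words[c] / total_words)))
        for (w, c), n in pairs.items()
    }

pmi = pmi_matrix(["the cat purrs", "the dog barks", "the cat sleeps"])
# "cat" co-occurs with "purrs" but never with "barks".
print(("cat", "purrs") in pmi and ("cat", "barks") not in pmi)  # → True
```

Explicit counting breaks down at vocabulary scale, which is why the abstract's word2vec-style approximation of the PMI matrix is attractive.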
This problem is generally referred to as the object partitioning problem (OPP) and is known to be NP-hard. We here consider a particularly challenging version of OPP, namely, the stochastic on-line equi-partitioning problem (SO-EPP). In SO-EPP, the target partitioning is unknown and has to be inferred purely by observing an on-line sequence of object pairs. The paired objects belong to the same partition with probability $p$ and to different partitions with probability $1-p$, with $p$ also being unknown. As an additional complication, the partitions are required to be of equal cardinality. Previously, only sub-optimal solution strategies have been proposed for SO-EPP. In this paper, we propose the first optimal solution strategy. In brief, the scheme that we propose, BN-EPP, is founded on a Bayesian network representation of SO-EPP problems. Based on probabilistic reasoning, we are not only able to infer the underlying object partitioning with optimal accuracy. We are also able to simultaneously infer $p$, allowing us to accelerate learning as object pairs arrive. Furthermore, our scheme is the first to support arbitrary constraints on the partitioning (constrained SO-EPP). Being optimal, BN-EPP provides superior performance compared to existing solution schemes. We additionally introduce Walk-BN-EPP, a novel WalkSAT inspired algorithm for solving large scale BN-EPP problems. Finally, we provide a BN-EPP based solution to the problem of order picking, a representative real-life application of BN-EPP.",4 "predictive entropy search for Bayesian optimization with unknown constraints. Unknown constraints arise in many types of expensive black-box optimization problems. Several methods have been proposed recently for performing Bayesian optimization with constraints, based on the expected improvement (EI) heuristic. However, EI can lead to pathologies when used with constraints. For example, in the case of decoupled constraints - i.e., when one can independently evaluate the objective or the constraints - EI can encounter a pathology that prevents exploration. Additionally, computing EI requires a current best solution, which may not exist if none of the data collected so far satisfy the constraints. By contrast, information-based approaches do not suffer from these failure modes. In this paper, we present a new information-based method called predictive entropy search with constraints (PESC). 
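The EI heuristic that the PESC abstract criticizes has a standard closed form for a Gaussian posterior; in particular it needs the incumbent best value f_best, which is exactly the quantity that may not exist when no feasible point has been found. A minimal sketch of the usual minimization-convention formula (not the paper's method):

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2), under
    minimization: EI = (f_best - mu) * Phi(z) + sigma * phi(z),
    with z = (f_best - mu) / sigma."""
    if sigma == 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))       # Gaussian CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # Gaussian PDF
    return (f_best - mu) * Phi + sigma * phi

# A point predicted below the incumbent has higher EI than one above it.
print(expected_improvement(0.0, 1.0, 1.0) > expected_improvement(2.0, 1.0, 1.0))  # → True
```

Information-based acquisitions like PESC sidestep the dependence on f_best by scoring candidate evaluations by how much they reduce uncertainty about the constrained optimum instead.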
We analyze the performance of PESC and show that it compares favorably to EI-based approaches on synthetic and benchmark problems, as well as several real-world examples. We demonstrate that PESC is an effective algorithm and provides a promising direction towards a unified solution for constrained Bayesian optimization.",19 "sparse representation of multivariate extremes with applications to anomaly ranking. Extremes play a special role in anomaly detection. Beyond inference and simulation purposes, probabilistic tools borrowed from extreme value theory (EVT), such as the angular measure, can also be used to design novel statistical learning methods for anomaly detection/ranking. This paper proposes a new algorithm based on multivariate EVT to learn how to rank observations in a high dimensional space with respect to their degree of 'abnormality'. The procedure relies on an original dimension-reduction technique in the extreme domain that possibly produces a sparse representation of multivariate extremes and allows to gain insight into the dependence structure thereof, escaping the curse of dimensionality. The representation output by the unsupervised methodology we propose can be combined with any anomaly detection technique tailored to non-extreme data. As it performs linearly with the dimension and almost linearly in the data (in O(dn log n)), it fits large scale problems. The approach in this paper is novel in that EVT has never been used in its multivariate version in the field of anomaly detection. Illustrative experimental results provide strong empirical evidence of the relevance of our approach.",19 "issues in the communication game. In interaction between autonomous agents, communication can be analyzed in game-theoretic terms. A meaning game is proposed to formalize the core of intended communication, in which the sender sends a message and the receiver attempts to infer the meaning intended by the sender. Basic issues involved in this game for natural language communication are discussed, such as salience, grammaticality, common sense, and common belief, together with a demonstration of the feasibility of a game-theoretic account of language.",4 "off-policy learning with eligibility traces: a survey. In the framework of Markov decision processes, off-policy learning is the problem of learning a linear approximation of the value function of some fixed policy from one trajectory possibly generated by some other policy. 
We briefly review the on-policy learning algorithms of the literature (gradient-based and least-squares-based), adopting a unified algorithmic view. Then, we highlight a systematic approach for adapting them to off-policy learning with eligibility traces. This leads to some known algorithms - off-policy LSTD(\lambda), LSPE(\lambda), TD(\lambda), TDC/GQ(\lambda) - and suggests new extensions - off-policy FPKF(\lambda), BRM(\lambda), gBRM(\lambda), GTD2(\lambda). We describe a comprehensive algorithmic derivation of all algorithms in a recursive and memory-efficient form, discuss their known convergence properties and illustrate their relative empirical behavior on Garnet problems. Our experiments suggest that the standard algorithms, on- and off-policy LSTD(\lambda)/LSPE(\lambda) - and TD(\lambda) if the feature space dimension is too large for a least-squares approach - perform the best.",4 "discriminative clustering with relative constraints. We study the problem of clustering with relative constraints, where each constraint specifies relative similarities among instances. In particular, each constraint $(x_i, x_j, x_k)$ is acquired by posing a query: is instance $x_i$ more similar to $x_j$ than to $x_k$? We consider the scenario where answers to such queries are based on an underlying (but unknown) class concept, which we aim to discover via clustering. Different from existing methods that only consider constraints derived from yes answers, we also incorporate don't know responses. We introduce a discriminative clustering method with relative constraints (DCRC) which assumes a natural probabilistic relationship between instances, their underlying cluster memberships, and the observed constraints. The objective is to maximize the model likelihood given the constraints, and in the meantime enforce cluster separation and cluster balance while also making use of the unlabeled instances. We evaluated the proposed method using constraints generated from ground-truth class labels, and from (noisy) human judgments from a user study. 
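The eligibility-trace mechanism common to the algorithms surveyed above is easiest to see in the tabular, on-policy TD(\lambda) special case: every recently visited state receives a share of each TD error, decayed by gamma * lambda per step. A minimal sketch on a two-state chain (the chain and step sizes are illustrative, not from the survey's Garnet problems):

```python
import numpy as np

def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating eligibility traces.
    Each episode is a list of (state, reward, next_state) transitions,
    with next_state = None at termination."""
    V = np.zeros(n_states)
    for episode in episodes:
        e = np.zeros(n_states)                   # eligibility traces
        for s, r, s_next in episode:
            v_next = V[s_next] if s_next is not None else 0.0
            delta = r + gamma * v_next - V[s]    # TD error
            e[s] += 1.0                          # accumulating trace
            V += alpha * delta * e               # credit all traced states
            e *= gamma * lam                     # decay traces
    return V

# Two-state chain: s0 -> s1 (reward 0), s1 -> terminal (reward 1).
episodes = [[(0, 0.0, 1), (1, 1.0, None)]] * 100
V = td_lambda(episodes, n_states=2)
# Both state values approach the true return of 1.0.
```

The off-policy variants in the survey replace the unit trace increment with importance-sampling-corrected updates over feature vectors, but the decay-and-credit structure is the same.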
Experimental results demonstrate: 1) the usefulness of relative constraints, in particular when don't know answers are considered; 2) the improved performance of the proposed method over state-of-the-art methods that utilize either relative or pairwise constraints; and 3) the robustness of our method in the presence of noisy constraints, such as those provided by human judgement.",4 "gated recurrent networks for seizure detection. Recurrent neural networks (RNNs) with sophisticated units that implement a gating mechanism have emerged as a powerful technique for modeling sequential signals such as speech or electroencephalography (EEG). The latter is the focus of this paper. A significant big data resource, known as the TUH EEG Corpus (TUEEG), has recently become available for EEG research, creating a unique opportunity to evaluate these recurrent units on the task of seizure detection. In this study, we compare two types of recurrent units: long short-term memory units (LSTM) and gated recurrent units (GRU). These are evaluated using a state of the art hybrid architecture that integrates convolutional neural networks (CNNs) with RNNs. We also investigate a variety of initialization methods and show that initialization is crucial since poorly initialized networks cannot be trained. Furthermore, we explore regularization of these convolutional gated recurrent networks to address the problem of overfitting. Our experiments revealed that convolutional LSTM networks can achieve significantly better performance than convolutional GRU networks. The convolutional LSTM architecture with proper initialization and regularization delivers 30% sensitivity at 6 false alarms per 24 hours.",6 "machine learning methods for histopathological image analysis. The abundant accumulation of digital histopathological images has led to an increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. However, digital pathological images and related tasks have some issues to be considered. In this mini-review, we introduce the application of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions.",4 "from propositional logic to plausible reasoning: a uniqueness theorem. 
we consider the question of extending propositional logic to a logic of plausible reasoning, and posit four requirements that any such extension should satisfy. each requirement is a property of classical propositional logic that we ask be preserved in the extended logic; as such, the requirements are simpler and less problematic than those used in cox's theorem and its variants. as with cox's theorem, the requirements imply that the extended logic must be isomorphic to (finite-set) probability theory. we also obtain specific numerical values for the probabilities, recovering the classical definition of probability as a theorem, with truth assignments that satisfy the premise playing the role of the ""possible cases.""",4 "phased exploration with greedy exploitation in stochastic combinatorial partial monitoring games. partial monitoring games are repeated games where the learner receives feedback that might be different from the adversary's move or even the reward gained by the learner. recently, a general model of combinatorial partial monitoring (cpm) games was proposed \cite{lincombinatorial2014}, where the learner's action space can be exponentially large and the adversary samples its moves from a bounded, continuous space, according to a fixed distribution. the paper gave a confidence bound based algorithm (gcb) that achieves an $o(t^{2/3}\log t)$ distribution independent and an $o(\log t)$ distribution dependent regret bound. the implementation of their algorithm depends on two separate offline oracles, and the distribution dependent regret additionally requires the existence of a unique optimal action for the learner. adopting their cpm model, our first contribution is a phased exploration with greedy exploitation (pege) algorithmic framework for the problem. different algorithms within the framework achieve $o(t^{2/3}\sqrt{\log t})$ distribution independent and $o(\log^2 t)$ distribution dependent regret, respectively. crucially, our framework needs only the simpler ""argmax"" oracle from gcb, and the distribution dependent regret does not require the existence of a unique optimal action. our second contribution is another algorithm, pege2, which combines gap estimation with the pege algorithm to achieve an $o(\log t)$ regret bound, matching the gcb guarantee while removing the dependence on the size of the learner's action space. however, like gcb, pege2 requires access to both offline oracles and the existence of a unique optimal action. 
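the phased exploration with greedy exploitation idea in the pege entry can be illustrated in a much-simplified setting: an ordinary stochastic multi-armed bandit rather than the cpm games the entry studies. the sketch below (function name and toy arms are ours, not the paper's) plays every arm for a fixed exploration budget and then commits greedily to the empirically best arm:

```python
import random

def phased_bandit(arms, horizon, explore_rounds, rng):
    """Phased exploration followed by greedy exploitation on a
    stochastic multi-armed bandit. `arms` are callables that draw
    one reward when called with an rng."""
    means = []
    total = 0.0
    # Exploration phase: sample every arm the same number of times.
    for arm in arms:
        s = 0.0
        for _ in range(explore_rounds):
            r = arm(rng)
            s += r
            total += r
        means.append(s / explore_rounds)
    # Exploitation phase: greedily play the empirically best arm.
    best = max(range(len(arms)), key=lambda i: means[i])
    for _ in range(horizon - explore_rounds * len(arms)):
        total += arms[best](rng)
    return best, total
```

the real pege framework interleaves exploration and exploitation in growing phases and works with an argmax oracle over a combinatorial action space; this single explore-then-commit pass only conveys the two-phase structure.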
finally, we discuss how our algorithms can be efficiently applied to a cpm problem of practical interest: namely, online ranking with feedback at the top.",4 "prior-based hierarchical segmentation highlighting structures of interest. image segmentation is the process of partitioning an image into a set of meaningful regions according to some criteria. hierarchical segmentation has emerged as a major trend in this regard, as it favors the emergence of important regions at different scales. on the other hand, many methods allow us to have prior information on the position of structures of interest in the images. in this paper, we present a versatile hierarchical segmentation method that takes into account any prior spatial information and outputs a hierarchical segmentation that emphasizes the contours or regions of interest while preserving the important structures in the image. several applications are presented that illustrate the method's versatility and efficiency.",4 "enhancing the use case points estimation method using soft computing techniques. software estimation is a crucial task in software engineering. software estimation encompasses cost, effort, schedule, and size. the importance of software estimation becomes critical in the early stages of the software life cycle, when the details of the software have not been revealed yet. several commercial and non-commercial tools exist to estimate software in the early stages. most software effort estimation methods require software size as one of the important metric inputs; consequently, software size estimation in the early stages becomes essential. one of the approaches that has been used for about two decades for early size and effort estimation is called use case points. the use case points method relies on the use case diagram to estimate the size and effort of software projects. although the use case points method has been widely used, it has limitations that might adversely affect the accuracy of estimation. this paper presents techniques using fuzzy logic and neural networks to improve the accuracy of the use case points method. results showed that an improvement of up to 22% can be obtained using the proposed approach.",4 "graph learning from filtered signals: graph system and diffusion kernel identification. this paper introduces a novel graph signal processing framework for building graph-based models of classes of filtered signals. 
in our framework, graph-based modeling is formulated as a graph system identification problem, where the goal is to learn a weighted graph (a graph laplacian matrix) and a graph-based filter (a function of graph laplacian matrices). in order to solve the proposed problem, an algorithm is developed to jointly identify a graph and a graph-based filter (gbf) from multiple signal/data observations. our algorithm is valid under the assumption that gbfs are one-to-one functions. the proposed approach can be applied to learn diffusion (heat) kernels, which are popular in various fields for modeling diffusion processes. in addition, for specific choices of graph-based filters, the proposed problem reduces to a graph laplacian estimation problem. our experimental results demonstrate that the proposed algorithm outperforms the current state-of-the-art methods. we also implement our framework on a real climate dataset for modeling of temperature signals.",4 "sample must: optimal functional sampling. we examine a fundamental problem that models various active sampling setups, such as network tomography. we analyze sampling of a multivariate normal distribution with an unknown expectation that needs to be estimated: in our setup it is possible to sample the distribution from a given set of linear functionals, and the difficulty addressed is how to optimally select the combinations to achieve a low estimation error. although this problem is at the heart of the field of optimal design, no efficient solutions for the case with many functionals exist. we present bounds and an efficient sub-optimal solution for this problem for structured sets of binary functionals that are induced by graph walks.",19 "generating thematic chinese poetry using conditional variational autoencoders with hybrid decoders. computer poetry generation is the first step towards computer writing. writing must have a theme. current approaches using sequence-to-sequence models with attention often produce non-thematic poems. we present a novel conditional variational autoencoder with a hybrid decoder that adds deconvolutional neural networks to the general recurrent neural networks to fully learn topic information via latent variables. this approach significantly improves the relevance of the generated poems by representing each line of the poem not only in a context-sensitive manner but also in a holistic way that is highly related to the given keyword and the learned topic. 
a proposed augmented word2vec model further improves the rhythm and symmetry. tests show that the poems generated by our approach are mostly satisfying with regard to regulated rules and consistent themes, and 73.42% of them receive an overall score no less than 3 (the highest score is 5).",4 "geometric dirichlet means algorithm for topic inference. we propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the latent dirichlet allocation (lda) model and its nonparametric extensions. to this end we study the optimization of a geometric loss function, which is a surrogate to the lda's likelihood. our method involves a fast optimization based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on gibbs sampling and variational inference, while achieving accuracy comparable to that of a gibbs sampler. the topic estimates produced by our method are shown to be statistically consistent under some conditions. the algorithm is evaluated with extensive experiments on simulated and real data.",19 "job detection in twitter. in this report, we propose a new application of twitter data called \textit{job detection}. we identify people's job category based on their tweets. as a preliminary work, we limited our task to identifying workers from other job holders. we used and compared a simple bag of words model and a document representation based on the skip-gram model. our results show that the model based on skip-gram achieves a 76\% precision and 82\% recall.",4 "attention-based models for text-dependent speaker verification. attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that expands through the entire length of an input sequence. in this paper, we analyze the usage of attention mechanisms for the problem of sequence summarization in our end-to-end text-dependent speaker recognition system. we explore different topologies and variants of the attention layer, and compare different pooling methods on the attention weights. ultimately, we show that attention-based models can improve the equal error rate (eer) of our speaker verification system by relatively 14% compared to our non-attention lstm baseline model.",6 "are you serious?: rhetorical questions and sarcasm in social media dialog. 
effective models of social dialog must understand a broad range of rhetorical and figurative devices. rhetorical questions (rqs) are a type of figurative language whose aim is to achieve a pragmatic goal, such as structuring an argument, being persuasive, emphasizing a point, or being ironic. while computational models exist for other forms of figurative language, rhetorical questions have received little attention to date. we expand a small dataset from previous work, presenting a corpus of 10,270 rqs from debate forums and twitter that represent different discourse functions. we show that we can clearly distinguish rqs from sincere questions (0.76 f1). we show that rqs are used both sarcastically and non-sarcastically, observing that non-sarcastic (other) uses of rqs are frequently argumentative in forums, and persuasive in tweets. we present experiments to distinguish between these uses of rqs using svm and lstm models that represent linguistic features and post-level context, achieving results as high as 0.76 f1 for ""sarcastic"" and 0.77 f1 for ""other"" in forums, and 0.83 f1 for both ""sarcastic"" and ""other"" in tweets. we supplement our quantitative experiments with an in-depth characterization of the linguistic variation in rqs.",4 "introducing elitist black-box models: when does elitist selection weaken the performance of evolutionary algorithms?. black-box complexity theory provides lower bounds for the runtime of black-box optimizers like evolutionary algorithms and serves as an inspiration for the design of new genetic algorithms. several black-box models covering different classes of algorithms exist, each highlighting a different aspect of the algorithms under consideration. in this work we add to the existing black-box notions a new \emph{elitist black-box model}, in which algorithms are required to base all decisions solely on (a fixed number of) the best search points sampled so far. our model combines features of the ranking-based and the memory-restricted black-box models with elitist selection. we provide several examples for which the elitist black-box complexity is exponentially larger than the respective complexities in all previous black-box models, thus showing that the elitist black-box complexity can be much closer to the runtime of typical evolutionary algorithms. we also introduce the concept of $p$-monte carlo black-box complexity, which measures the time it takes to optimize a problem with failure probability at most $p$. 
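the elitist selection that the black-box entry above restricts attention to is exactly the acceptance rule of the classic (1+1) evolutionary algorithm: the offspring replaces the parent only if it is at least as good, so all decisions depend on the single best point found so far. a minimal sketch on the onemax benchmark (our toy choice, not taken from the entry):

```python
import random

def one_plus_one_ea(n, rng, max_iters=100_000):
    """Elitist (1+1) EA on OneMax (maximise the number of one-bits).
    The offspring replaces the parent only if it is at least as good,
    i.e. every decision is based solely on the best point so far."""
    x = [rng.randrange(2) for _ in range(n)]
    for t in range(1, max_iters + 1):
        # Standard bit mutation: flip each bit independently with prob 1/n.
        y = [b ^ (rng.random() < 1.0 / n) for b in x]
        if sum(y) >= sum(x):   # elitist acceptance
            x = y
        if sum(x) == n:        # optimum reached
            return t
    return max_iters
```

on onemax elitism is harmless; the entry's point is that on other function classes forcing this rule can blow up the black-box complexity exponentially.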
we show that even for small~$p$, the $p$-monte carlo black-box complexity of a function class $\mathcal f$ can be smaller by an exponential factor than its typically regarded las vegas complexity (which measures the \emph{expected} time it takes to optimize $\mathcal f$).",4 "medical image analysis using convolutional neural networks: a review. medical image analysis is the science of analyzing and solving medical problems using different image analysis techniques for effective and efficient extraction of information. it has emerged as one of the top research areas in the field of engineering and medicine. recent years have witnessed rapid use of machine learning algorithms in medical image analysis. these machine learning techniques are used to extract compact information for improved performance of medical image analysis systems, as compared to the traditional methods that use extraction of handcrafted features. deep learning is a breakthrough in machine learning techniques that has overwhelmed the field of pattern recognition and computer vision research by providing state-of-the-art results. deep learning provides different machine learning algorithms that model high level data abstractions and do not rely on handcrafted features. recently, deep learning methods utilizing deep convolutional neural networks have been applied to medical image analysis, providing promising results. the application area covers the whole spectrum of medical image analysis, including detection, segmentation, classification, and computer aided diagnosis. this paper presents a review of the state-of-the-art convolutional neural network based techniques used for medical image analysis.",4 "clickbait detection using word embeddings. clickbait is a pejorative term describing web content that is aimed at generating online advertising revenue, especially at the expense of quality or accuracy, relying on sensationalist headlines or eye-catching thumbnail pictures to attract click-throughs and to encourage forwarding of the material over online social networks. we use distributed word representations of the words in the title as features to identify clickbaits in online news media. we train a machine learning model using linear regression to predict the clickbait score of a given tweet. our methods achieve an f1-score of 64.98\% and an mse of 0.0791. 
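the pipeline in the clickbait entry - word embeddings of the title pooled into a feature vector and fed to a linear regressor predicting a clickbait score - can be sketched in a few lines. everything below is invented toy data (the 2-d "embeddings", the titles and the scores); the paper used real pretrained distributed representations:

```python
# Toy sketch: mean-pooled word embeddings + linear regression
# predicting a clickbait score. All values are fabricated.
EMB = {
    "you": [0.9, 0.1], "won't": [0.8, 0.2], "believe": [0.9, 0.3],
    "this": [0.7, 0.1], "study": [0.1, 0.9], "finds": [0.2, 0.8],
    "new": [0.3, 0.7], "results": [0.1, 0.8],
}

def featurize(title):
    """Single feature: mean of the first embedding dimension over words."""
    vecs = [EMB[w] for w in title.lower().split() if w in EMB]
    return sum(v[0] for v in vecs) / len(vecs)

def fit_line(xs, ys):
    """Ordinary least squares for one feature plus an intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope

titles = ["you won't believe this", "study finds new results"]
scores = [0.9, 0.1]                  # toy clickbait scores in [0, 1]
xs = [featurize(t) for t in titles]
intercept, slope = fit_line(xs, scores)
pred = intercept + slope * featurize("you won't believe this study")
```

a headline mixing "clickbaity" and "sober" words lands between the two training scores, which is all this toy model is meant to show.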
compared to other methods, our method is simple, fast to train, and does not require extensive feature engineering, and yet is moderately effective.",4 a microwave imaging enhancement technique from noisy synthetic data. an inverse iterative algorithm for microwave imaging based on a moment method solution is presented here. the iterative scheme has been developed with a constrained optimization technique and is certain to converge. different mesh sizes for the model are used to overcome the inverse crime. the synthetic data at the receivers are contaminated with different percentages of noise. the ill-posedness of the problem is solved by the levenberg-marquardt method. the algorithm is applied to synthetic data and the reconstructed image is enhanced by an image enhancement technique,4 "a geometric proof of calibration. we provide yet another proof of the existence of calibrated forecasters; it has two merits. first, it is valid for an arbitrary finite number of outcomes. second, it is short and simple, and it follows from a direct application of blackwell's approachability theorem to a carefully chosen vector-valued payoff function and convex target set. our proof captures the essence of existing proofs based on approachability (e.g., the proof of foster, 1999 in the case of binary outcomes) and highlights the intrinsic connection between approachability and calibration.",19 "videostory embeddings recognize events when examples are scarce. this paper aims for event recognition when video examples are scarce or even completely absent. the key in such a challenging setting is a semantic video representation. rather than building the representation from individual attribute detectors and their annotations, we propose to learn the entire representation from freely available web videos and their descriptions using an embedding between video features and term vectors. in our proposed embedding, which we call videostory, the correlations between the terms are utilized to learn a more effective representation by optimizing a joint objective balancing descriptiveness and predictability. we show how learning the videostory using a multimodal predictability loss, including appearance, motion and audio features, results in a better predictable representation. we also propose a variant of videostory to recognize an event in video from just the important terms in a text query, by introducing a term sensitive descriptiveness loss. 
our experiments on three challenging collections of web videos from the nist trecvid multimedia event detection and columbia consumer videos datasets demonstrate: i) the advantages of videostory representations over attributes and alternative embeddings, ii) the benefit of fusing video modalities by an embedding over common strategies, and iii) the complementarity of term sensitive descriptiveness and multimodal predictability for event recognition without examples. by its ability to improve predictability upon an underlying video feature while at the same time maximizing semantic descriptiveness, videostory leads to state-of-the-art accuracy for few- and zero-example recognition of events in video.",4 "a biological gradient descent for prediction through a combination of stdp and homeostatic plasticity. identifying, formalizing and combining biological mechanisms that implement known brain functions, such as prediction, is a main aspect of current research in theoretical neuroscience. in this letter, the mechanisms of spike timing dependent plasticity (stdp) and homeostatic plasticity, combined in an original mathematical formalism, are shown to shape recurrent neural networks into predictors. following a rigorous mathematical treatment, we prove that they implement the online gradient descent of a distance between the network activity and its stimuli. the convergence to an equilibrium, where the network can spontaneously reproduce or predict its stimuli, does not suffer from the bifurcation issues usually encountered in learning in recurrent neural networks.",16 "understanding the effective receptive field in deep convolutional neural networks. we study characteristics of receptive fields of units in deep convolutional networks. the receptive field size is a crucial issue in many visual tasks, as the output must respond to large enough areas in the image to capture information about large objects. we introduce the notion of an effective receptive field, and show that it both has a gaussian distribution and only occupies a fraction of the full theoretical receptive field. we analyze the effective receptive field in several architecture designs, and the effect of nonlinear activations, dropout, sub-sampling and skip connections on it. this leads to suggestions for ways to address its tendency to be too small.",4 "small moving window calibration models for soft sensing processes with limited history. 
five simple soft sensor methodologies with two update conditions were compared on two experimentally-obtained datasets and one simulated dataset. the soft sensors investigated were moving window partial least squares regression (and a recursive variant), moving window random forest regression, the mean moving window of $y$, and a novel random forest partial least squares regression ensemble (rf-pls), all of which can be used with small sample sizes so that they can be rapidly placed online. it was found that, on the two datasets studied, small window sizes led to the lowest prediction errors for all of the moving window methods studied. on the majority of the datasets studied, the rf-pls calibration method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and it demonstrated greater predictive stability at larger time delays than moving window pls alone. it was found that both the random forest and rf-pls methods adequately modeled the datasets that did not feature purely monotonic increases in property values, but both methods performed poorly relative to moving window pls models on one dataset with purely monotonic property values. data dependent findings are presented and discussed.",19 a probabilistic transmission expansion planning methodology based on roulette wheel selection and social welfare. a new probabilistic methodology for transmission expansion planning (tep) that does not require a priori specification of new/additional transmission capacities and uses the concept of social welfare is proposed. two new concepts are introduced in this paper: (i) a roulette wheel methodology is used to calculate the capacity of new transmission lines and (ii) load flow analysis is used to calculate the expected demand not served (edns). the overall methodology is implemented on a modified ieee 5-bus test system. simulations show an important result: the addition of new transmission lines alone is not sufficient to minimize the edns.,4 "parcellation of fmri datasets with ica and pls - a data driven approach. inter-subject parcellation of functional magnetic resonance imaging (fmri) data based on a standard general linear model (glm) and spectral clustering was recently proposed as a means to alleviate the issues associated with spatial normalization in fmri. 
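the roulette wheel methodology named in the transmission-planning entry above is the standard fitness-proportionate selection operator; a minimal sketch (the fitness values in the usage below are made up for illustration) might look like:

```python
import random

def roulette_wheel(items, fitness, rng=random):
    """Fitness-proportionate (roulette wheel) selection: item i is
    chosen with probability fitness[i] / sum(fitness)."""
    total = sum(fitness)
    spin = rng.random() * total
    acc = 0.0
    for item, f in zip(items, fitness):
        acc += f
        if spin < acc:
            return item
    return items[-1]  # guard against floating-point round-off
```

over many spins the selection frequencies approach the fitness proportions, which is the property such planning heuristics rely on; how the entry maps wheel outcomes to line capacities is specific to that paper and not reproduced here.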
however, for all its appeal, a glm-based parcellation approach introduces its own biases, in the form of a priori knowledge about the shape of the hemodynamic response function (hrf), task-related signal changes, or subject behaviour during the task. in this paper, we introduce a data-driven version of the spectral clustering parcellation, based on independent component analysis (ica) and partial least squares (pls) instead of the glm. first, a number of independent components are automatically selected. seed voxels are obtained from the associated ica maps, and we compute the pls latent variables between the fmri signal of the seed voxels (which covers regional variations of the hrf) and the principal components of the signal across all voxels. finally, we parcellate all subjects' data with a spectral clustering of the pls latent variables. we present results of the application of the proposed method on both single-subject and multi-subject fmri datasets. preliminary experimental results, evaluated with the intra-parcel variance of glm t-values and pls derived t-values, indicate that this data-driven approach offers improvement in terms of parcellation accuracy over glm based techniques.",4 "cheap bandits. we consider stochastic sequential learning problems where the learner can observe the \textit{average reward of several actions}. such a setting is interesting in many applications involving monitoring and surveillance, where the set of actions to observe represents some (geographical) area. the importance of this setting is that in these applications it is actually \textit{cheaper} to observe the average reward of a group of actions rather than the reward of a single action. we show that when the reward is \textit{smooth} over a given graph representing the neighboring actions, we can maximize the cumulative reward of learning while \textit{minimizing the sensing cost}. in this paper we propose cheapucb, an algorithm that matches the regret guarantees of the known algorithms for this setting and at the same time guarantees a linear gain in cost over them. as a by-product of our analysis, we establish an $\omega(\sqrt{dt})$ lower bound on the cumulative regret of spectral bandits for a class of graphs with effective dimension $d$.",4 "play: retrieval of video segments using natural-language queries. in this paper, we propose a new approach for the retrieval of video segments using natural language queries. 
unlike previous approaches such as concept-based methods or rule-based structured models, the proposed method uses an image captioning model to construct sentential queries from visual information. in detail, our approach exploits multiple captions generated from the visual features in an image with 'densecap'. then, the similarities between the captions of adjacent images are calculated, and they are used to track semantically similar captions over multiple frames. besides introducing the novel idea of 'tracking by captioning', the proposed method is one of the first approaches that uses a language generation model learned by neural networks to construct a semantic query describing the relations and properties of visual information. to evaluate the effectiveness of our approach, we created a new evaluation dataset, which contains about 348 segments of scenes in 20 movie-trailers. through quantitative and qualitative evaluation, we show that our method is effective for the retrieval of video segments using natural language queries.",4 "classification of polar-thermal eigenfaces using a multilayer perceptron for human face recognition. this paper presents a novel approach to handle the challenges of face recognition. in this work thermal face images are considered, which minimize the effect of illumination changes and occlusion due to moustache, beards, adornments etc. the proposed approach registers the training and testing thermal face images in polar coordinates, which is capable of handling the complicacies introduced by scaling and rotation. the polar images are projected into eigenspace and finally classified using a multi-layer perceptron. in the experiments we have used the object tracking and classification beyond visible spectrum (otcbvs) database benchmark thermal face images. experimental results show that the proposed approach significantly improves the verification and identification performance, with a success rate of 97.05%.",4 "maximum entropy deep inverse reinforcement learning. this paper presents a general framework for exploiting the representational capacity of neural networks to approximate complex, nonlinear reward functions in the context of solving the inverse reinforcement learning (irl) problem. we show in this context that the maximum entropy paradigm for irl lends itself naturally to the efficient training of deep architectures. 
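the polar-coordinate registration used in the thermal-face entry above can be sketched with plain nearest-neighbour resampling; the function below is our illustrative rendering (the paper does not publish this exact code), operating on a grayscale image stored as a list of rows:

```python
import math

def to_polar(img, n_r, n_t):
    """Nearest-neighbour resampling of a grayscale image (list of rows)
    onto an (n_r x n_t) polar grid centred on the image centre.
    Row index = radius, column index = angle."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    out = []
    for ri in range(n_r):
        r = r_max * ri / max(n_r - 1, 1)
        row = []
        for ti in range(n_t):
            theta = 2.0 * math.pi * ti / n_t
            y = int(round(cy + r * math.sin(theta)))
            x = int(round(cx + r * math.cos(theta)))
            row.append(img[y][x])
        out.append(row)
    return out
```

the design point behind such a representation is that an in-plane rotation of the face becomes a circular shift along the angular axis, and isotropic scaling a shift along the radial axis, which is what makes the subsequent eigenspace classifier less sensitive to those transformations.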
at test time, the approach leads to a computational complexity independent of the number of demonstrations, which makes it especially well-suited for applications in life-long learning scenarios. our approach achieves performance commensurate with the state-of-the-art on existing benchmarks while exceeding it on an alternative benchmark based on highly varying reward structures. finally, we extend the basic architecture - which is equivalent to a simplified subclass of fully convolutional neural networks (fcnns) with width one - to include larger convolutions in order to eliminate the dependency on precomputed spatial features and work on raw input representations.",4 "on the effectiveness of least squares generative adversarial networks. unsupervised learning with generative adversarial networks (gans) has proven to be hugely successful. regular gans hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. however, we found that this loss function may lead to the vanishing gradients problem during the learning process. to overcome such a problem, we propose in this paper the least squares generative adversarial networks (lsgans), which adopt the least squares loss function for the discriminator. we show that minimizing the objective function of lsgan yields minimizing the pearson $\chi^2$ divergence. we also present a theoretical analysis of the properties of lsgans and the $\chi^2$ divergence. there are two benefits of lsgans over regular gans. first, lsgans are able to generate higher quality images than regular gans. second, lsgans perform more stably during the learning process. for evaluating the image quality, we train lsgans on several datasets, including the lsun cat dataset, and the experimental results show that the images generated by lsgans are of better quality than the ones generated by regular gans. furthermore, we evaluate the stability of lsgans in two groups. one is to compare lsgans with regular gans without gradient penalty. we conduct three experiments, including gaussian mixture distribution, difficult architectures, and a newly proposed method --- datasets with small variance, to illustrate the stability of lsgans. the other one is to compare lsgans with gradient penalty and wgans with gradient penalty (wgans-gp). 
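for reference, the least squares objectives the lsgan entry refers to are usually written with target codings $a$ (fake), $b$ (real) and $c$ (the value the generator wants the discriminator to output for fakes):

$$\min_D V(D) = \tfrac{1}{2}\,\mathbb{E}_{x\sim p_{\mathrm{data}}}\big[(D(x)-b)^2\big] + \tfrac{1}{2}\,\mathbb{E}_{z\sim p_z}\big[(D(G(z))-a)^2\big],$$
$$\min_G V(G) = \tfrac{1}{2}\,\mathbb{E}_{z\sim p_z}\big[(D(G(z))-c)^2\big].$$

under the coding conditions $b-c=1$ and $b-a=2$, optimizing the generator against the optimal discriminator minimizes the pearson $\chi^2$ divergence between $p_{\mathrm{data}}+p_g$ and $2p_g$, which is the divergence result the entry states.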
the experimental results show that lsgans with gradient penalty succeed in training for all the difficult architectures used by wgans-gp, including the 101-layer resnet.",4 "unsupervised learning of monocular depth estimation and visual odometry with deep feature reconstruction. despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner. recent approaches to single view depth estimation explore the possibility of learning without full supervision via minimizing photometric error. in this paper, we explore the use of stereo sequences for learning depth and visual odometry. the use of stereo sequences enables the use of both spatial (between left-right pairs) and temporal (forward-backward) photometric warp error, and constrains the scene depth and camera motion to be in a common, real-world scale. at test time our framework is able to estimate single view depth and two-view odometry from a monocular sequence. we also show how we improve the standard photometric warp loss by considering a warp of deep features. we show through extensive experiments that: (i) jointly training for single view depth and visual odometry improves depth prediction because of the additional constraint imposed on depths, and achieves competitive results for visual odometry; (ii) the deep feature-based warping loss improves upon the simple photometric warp loss for both single view depth estimation and visual odometry. our method outperforms existing learning based methods on the kitti driving dataset in both tasks. the source code is available at https://github.com/huangying-zhan/depth-vo-feat",4 "online learning with switching costs and other adaptive adversaries. we study the power of different types of adaptive (nonoblivious) adversaries in the setting of prediction with expert advice, under both full-information and bandit feedback. we measure the player's performance using a new notion of regret, also known as policy regret, which better captures the adversary's adaptiveness to the player's behavior. in a setting where losses are allowed to drift, we characterize ---in a nearly complete manner--- the power of adaptive adversaries with bounded memories and switching costs. in particular, we show that with switching costs, the attainable rate with bandit feedback is $\widetilde{\theta}(t^{2/3})$. 
interestingly, this rate is significantly worse than the $\theta(\sqrt{t})$ rate attainable with switching costs in the full-information case. via a novel reduction from experts to bandits, we also show that a bounded memory adversary can force $\widetilde{\theta}(t^{2/3})$ regret even in the full information case, proving that switching costs are easier to control than bounded memory adversaries. our lower bounds rely on a new stochastic adversary strategy that generates loss processes with strong dependencies.",4 "quality expectations of machine translation. machine translation (mt) is being deployed for a range of use-cases by millions of people on a daily basis. there should, therefore, be no doubt as to the utility of mt. however, not everyone is convinced that mt can be useful, especially as a productivity enhancer for human translators. in this chapter, we address this issue, describing how mt is currently deployed, how its output is evaluated, and how this could be enhanced, especially as mt quality itself improves. central to these issues is the acceptance that there is no longer a single 'gold standard' measure of quality, so the situation in which mt is deployed needs to be borne in mind, especially with respect to the expected 'shelf-life' of the translation itself.",4 "technical report: image captioning with semantically similar images. this report presents our submission to the ms coco captioning challenge 2015. the method uses convolutional neural network activations as an embedding to find semantically similar images. for those images, a typical caption is selected based on unigram frequencies. although the method received low scores on the automated evaluation metrics and in human assessed average correctness, it is competitive in the ratio of captions which pass the turing test and which are assessed as better than or equal to human captions.",4 "a compressed model of residual cnds. convolutional neural networks have achieved great success in recent years. however, the way to maximize the performance of convolutional neural networks is still at its beginning. furthermore, the optimization of the size of and the time needed to train convolutional neural networks is still far from reaching researchers' ambitions. in this paper, we propose a new convolutional neural network that combines several techniques to boost the optimization of the convolutional neural network in the aspects of speed and size. 
we used our previous model residual-cnds (rescnds), which solved the problems of slower convergence, overfitting, and degradation, and compressed it. the outcome model is called residual-squeeze-cnds (ressqucnds), and it demonstrated a solid technique for adding residual learning and our model compressing to convolutional neural networks. our model compressing is adapted from the squeezenet model; moreover, the model is generalizable, can be applied to almost any neural network model, and is fully integrated with residual learning, which addresses the problem of degradation successfully. our proposed model was trained on the large-scale mit places365-standard scene datasets, which backs up our hypothesis that the new compressed model inherited the best of the previous rescnds8 model, and almost got the same accuracy on the validation top-1 and top-5, with a size smaller by 13.33% and a faster training time.",4 "a bipartite graph matching framework for keyframe summary evaluation. a keyframe summary, or ""static storyboard"", is a collection of frames from a video designed to summarise its semantic content. many algorithms have been proposed to extract such summaries automatically. how best to evaluate their outputs is an important but little-discussed question. we review the current methods for matching frames between two summaries in the formalism of graph theory. our analysis reveals different behaviours of these methods, which we illustrate with a number of case studies. based on the results, we recommend a greedy matching algorithm due to kannappan et al.",4 "specious rules: an efficient and effective unifying method for removing misleading and uninformative patterns in association rule mining. we present a theoretical analysis and a suite of tests and procedures for addressing a broad class of redundant and misleading association rules we call \emph{specious rules}. specious dependencies, also known as \emph{spurious}, \emph{apparent}, or \emph{illusory associations}, refer to the well-known phenomenon where marginal dependencies are merely products of interactions with other variables and disappear when conditioned on those variables. an extreme example is yule-simpson's paradox, where two variables present positive dependence in the marginal contingency table but negative in all partial tables defined by different levels of a confounding factor. it is accepted wisdom that in data of nontrivial dimensionality it is infeasible to control for all of the exponentially many possible confounds of this nature. 
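the greedy matching that the keyframe-evaluation entry above recommends pairs frames across two summaries by repeatedly taking the highest-similarity unmatched pair. the sketch below keeps only that greedy core and invents its own data layout (a dict of pairwise similarities); the procedure of kannappan et al. has additional details not reproduced here:

```python
def greedy_matching(similarity):
    """Greedy maximum-similarity matching between the frames of two
    keyframe summaries. `similarity` maps (i, j) -- frame i of summary
    A, frame j of summary B -- to a similarity score."""
    ranked = sorted(similarity.items(), key=lambda kv: kv[1], reverse=True)
    used_a, used_b, matches = set(), set(), []
    for (i, j), s in ranked:
        if i not in used_a and j not in used_b:
            matches.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return matches
```

the resulting one-to-one matching can then be scored, e.g. by the number of matched pairs above a similarity threshold, to compare an automatic summary against a reference one.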
in this paper, we consider the problem of specious dependencies in the context of statistical association rule mining. we define specious rules and show that they offer a unifying framework which covers many types of previously proposed redundant or misleading association rules. after a theoretical analysis, we introduce practical algorithms for detecting and pruning specious association rules efficiently under many key goodness measures, including mutual information and exact hypergeometric probabilities. we demonstrate that the procedure greatly reduces the number of associations discovered, providing an elegant and effective solution to the problem of association mining discovering large numbers of misleading and redundant rules.",4 "a lipschitz exploration-exploitation scheme for bayesian optimization. the problem of optimizing unknown costly-to-evaluate functions has been studied for a long time in the context of bayesian optimization. algorithms in this field aim to find the optimizer of the function by asking only a few function evaluations, at locations carefully selected based on a posterior model. in this paper, we assume the unknown function is lipschitz continuous. leveraging the lipschitz property, we propose an algorithm with a distinct exploration phase followed by an exploitation phase. the exploration phase aims to select samples that shrink the search space as much as possible. the exploitation phase then focuses on the reduced search space and selects samples closest to the optimizer. considering the expected improvement (ei) as a baseline, we empirically show that the proposed algorithm significantly outperforms ei.",4 "an evaluation of deep learning on an abstract image classification dataset. convolutional neural networks have become the state of the art methods for image classification over the last couple of years. by now they perform better than human subjects on many of the image classification datasets. most of these datasets are based on the notion of concrete classes (i.e. images are classified by the type of object in the image). in this paper we present a novel image classification dataset, using abstract classes, which should be easy to solve for humans, but variations of it are challenging for cnns. the classification performance of popular cnn architectures is evaluated on this dataset, and variations of the dataset that might be interesting for further research are identified.",4 "simulating spiking neural p systems without delays using gpus. 
we present in this paper our work regarding simulating a type of p system known as a spiking neural p system (snp system) using graphics processing units (gpus). gpus, because of their architectural optimization for parallel computations, are well-suited for highly parallelizable problems. due to the advent of general purpose gpu computing in recent years, gpus are not limited to graphics and video processing alone, but include computationally intensive scientific and mathematical applications as well. moreover p systems, including snp systems, are inherently maximally parallel computing models whose inspirations are taken from the functioning and dynamics of a living cell. in particular, snp systems try to give a modest but formal representation of a special type of cell known as the neuron and their interactions with one another. the nature of snp systems allowed their representation as matrices, which is a crucial step in simulating them on highly parallel devices such as gpus. the highly parallel nature of snp systems necessitates the use of hardware intended for parallel computations. the simulation algorithms, design considerations, and implementation are presented. finally, simulation results, observations, and analyses using an snp system that generates all numbers in $\mathbb n$ - {1} are discussed, as well as recommendations for future work.",4 "a distributed algorithm for training nonlinear kernel machines. this paper concerns the distributed training of nonlinear kernel machines on map-reduce. we show that a re-formulation of the nystr\""om approximation based solution which is solved using gradient based techniques is well suited for this, especially when it is necessary to work with a large number of basis points. the main advantages of this approach are: avoidance of computing the pseudo-inverse of the kernel sub-matrix corresponding to the basis points; simplicity and efficiency of the distributed part of the computations; and, friendliness to stage-wise addition of basis points. we implement the method using an allreduce tree on hadoop and demonstrate its value on large benchmark datasets.",4 "an analysis of a random algorithm for estimating matchings. 
counting number matchings bipartite graph transformed calculating permanent matrix obtained extended bipartite graph yan huo, rasmussen presents simple approach (rm) approximate permanent, yields critical ratio o($n\omega(n)$) almost 0-1 matrices, provided simple promising practical way compute #p-complete problem. paper, performance method shown applied compute matchings based transformation. critical ratio proved large certain probability, owning increasing factor larger polynomial $n$ even sense almost 0-1 matrices. hence, rm fails work well counting matchings via computing permanent matrix. words, must carefully utilize known methods estimating permanent count matchings transformation.",4 "incremental construction minimal acyclic finite-state automata. paper, describe new method constructing minimal, deterministic, acyclic finite-state automata set strings. traditional methods consist two phases: first construct trie, second one minimize it. approach construct minimal automaton single phase adding new strings one one minimizing resulting automaton on-the-fly. present general algorithm well specialization relies upon lexicographical ordering input strings.",4 "constraints, lazy constraints, propagators asp solving: empirical analysis. answer set programming (asp) well-established declarative paradigm. one successes asp availability efficient systems. state-of-the-art systems based ground+solve approach. applications approach infeasible grounding one constraints expensive. paper, systematically compare alternative strategies avoid instantiation problematic constraints, based custom extensions solver. results real synthetic benchmarks highlight strengths weaknesses different strategies. (under consideration acceptance tplp, iclp 2017 special issue.)",4 "building rules top ontologies semantic web inductive logic programming. building rules top ontologies ultimate goal logical layer semantic web. aim ad-hoc mark-up language layer currently discussion. 
intended follow tradition hybrid knowledge representation reasoning systems $\mathcal{al}$-log integrates description logic $\mathcal{alc}$ function-free horn clausal language \textsc{datalog}. paper consider problem automating acquisition rules semantic web. propose general framework rule induction adopts methodological apparatus inductive logic programming relies expressive deductive power $\mathcal{al}$-log. framework valid whatever scope induction (description vs. prediction) is. yet, illustrative purposes, also discuss instantiation framework aims description turns useful ontology refinement. keywords: inductive logic programming, hybrid knowledge representation reasoning systems, ontologies, semantic web. note: appear theory practice logic programming (tplp)",4 "lost time: temporal analytics long-term video surveillance. video surveillance well researched area study substantial work done aspects object detection, tracking behavior analysis. abundance video data captured long period time, understand patterns human behavior scene dynamics data-driven temporal analytics. work, propose two schemes perform descriptive predictive analytics long-term video surveillance data. generate heatmap footmap visualizations describe spatially pooled trajectory patterns respect time location. also present two approaches anomaly prediction day-level granularity: trajectory-based statistical approach, time-series based approach. experimentation one year data single camera demonstrates ability uncover interesting insights scene predict anomalies reasonably well.",4 "surveillance video parsing single frame supervision. surveillance video parsing, segments video frames several labels, e.g., face, pants, left-leg, wide applications. however, pixel-wisely annotating frames tedious inefficient. paper, develop single frame video parsing (svp) method requires one labeled frame per video training stage. parse one particular frame, video segment preceding frame jointly considered.
svp (1) roughly parses frames within video segment, (2) estimates optical flow frames (3) fuses rough parsing results warped optical flow produce refined parsing result. three components svp, namely frame parsing, optical flow estimation temporal fusion integrated end-to-end manner. experimental results two surveillance video datasets show superiority svp state-of-the-arts.",4 "solving problem k parameter knn classifier using ensemble learning approach. paper presents new solution choosing k parameter k-nearest neighbor (knn) algorithm, solution depending idea ensemble learning, weak knn classifier used time different k, starting one square root size training set. results weak classifiers combined using weighted sum rule. proposed solution tested compared solutions using group experiments real life problems. experimental results show proposed classifier outperforms traditional knn classifier uses different number neighbors, competitive classifiers, promising classifier strong potential wide range applications.",4 "video genome. fast evolution internet technologies led explosive growth video data available public domain created unprecedented challenges analysis, organization, management, control content. problems encountered video analysis identifying video large database (e.g. detecting pirated content youtube), putting together video fragments, finding similarities common ancestry different versions video, analogous counterpart problems genetic research analysis dna protein sequences. paper, exploit analogy genetic sequences videos propose approach video analysis motivated genomic research. representing video information video dna sequences applying bioinformatic algorithms allows search, match, compare videos large-scale databases. show application content-based metadata mapping versions annotated video.",4 "pruning variable selection ensembles. 
context variable selection, ensemble learning gained increasing interest due great potential improve selection accuracy reduce false discovery rate. novel ordering-based selective ensemble learning strategy designed paper obtain smaller accurate ensembles. particular, greedy sorting strategy proposed rearrange order members included integration process. stopping fusion process early, smaller subensemble higher selection accuracy obtained. importantly, sequential inclusion criterion reveals fundamental strength-diversity trade-off among ensemble members. taking stability selection (abbreviated stabsel) example, experiments conducted simulated real-world data examine performance novel algorithm. experimental results demonstrate pruned stabsel generally achieves higher selection accuracy lower false discovery rates stabsel several benchmark methods.",19 "theory optimizing pseudolinear performance measures: application f-measure. non-linear performance measures widely used evaluation learning algorithms. example, $f$-measure commonly used performance measure classification problems machine learning information retrieval community. study theoretical properties subset non-linear performance measures called pseudo-linear performance measures includes $f$-measure, \emph{jaccard index}, among many others. establish many notions $f$-measures \emph{jaccard index} pseudo-linear functions per-class false negatives false positives binary, multiclass multilabel classification. based observation, present general reduction performance measure optimization problem cost-sensitive classification problem unknown costs. propose algorithm provable guarantees obtain approximately optimal classifier $f$-measure solving series cost-sensitive classification problems. strength analysis valid dataset class classifiers, extending existing theoretical results pseudo-linear measures, asymptotic nature. 
also establish multi-objective nature $f$-score maximization problem linking algorithm weighted-sum approach used multi-objective optimization. present numerical experiments illustrate relative importance cost asymmetry thresholding learning linear classifiers various $f$-measure optimization tasks.",4 "arguments effectiveness human problem solving. question humans solve problem addressed extensively. however, direct study effectiveness process seems overlooked. paper, address issue effectiveness human problem solving: analyze effectiveness comes cognitive mechanisms heuristics involved. results based optimal probabilistic problem solving strategy appeared solomonoff paper general problem solving system. provide arguments certain set cognitive mechanisms heuristics drive human problem solving similar manner optimal solomonoff strategy. results presented paper serve cognitive psychology better understanding human problem solving processes well artificial intelligence designing human-like agents.",4 "infinite hierarchical factor regression model. propose nonparametric bayesian factor regression model accounts uncertainty number factors, relationship factors. accomplish this, propose sparse variant indian buffet process couple hierarchical model factors, based kingman's coalescent. apply model two problems (factor analysis factor regression) gene-expression data analysis.",4 "plausibility probability?(revised 2003, 2015). present examine result related uncertainty reasoning, namely certain plausibility space cox's type uniquely embedded minimal ordered field. this, although purely mathematical result, claimed imply every rational method reason uncertainty must based sets extended probability distributions, extended probability standard probability extended infinitesimals. claim must supported argumentation non-mathematical type, however, since pure mathematics tell us anything world. propose one argumentation, relate results literature uncertainty statistics. 
added retrospective section discuss developments area regarding countable additivity, partially ordered domains robustness, philosophical stances cox/jaynes approach since 2003. also show general partially ordered plausibility calculus embeddable ring represented set extended probability distributions or, algebraic terms, subdirect sum ordered fields. words, robust bayesian approach universal. result exemplified relating dempster-shafer's evidence theory robust bayesian analysis.",4 "speaker diarization lstm. many years, i-vector based audio embedding techniques dominant approach speaker verification speaker diarization applications. however, mirroring rise deep learning various domains, neural network based audio embeddings, also known d-vectors, consistently demonstrated superior speaker verification performance. paper, build success d-vector based speaker verification systems develop new d-vector based approach speaker diarization. specifically, combine lstm-based d-vector audio embeddings recent work non-parametric clustering obtain state-of-the-art speaker diarization system. system evaluated three standard public datasets, suggesting d-vector based diarization systems offer significant advantages traditional i-vector based systems. achieved 12.0% diarization error rate nist sre 2000 callhome, model trained out-of-domain data voice search logs.",6 "design analysis multiple view descriptors. propose extension popular descriptors based gradient orientation histograms (hog, computed single image) multiple views. hinges interpreting hog conditional density space sampled images, effects nuisance factors viewpoint illumination marginalized. however, marginalization performed respect coarse approximation underlying distribution. extension leverages fact multiple views scene allow separating intrinsic nuisance variability, thus afford better marginalization latter. 
result descriptor complexity single-view hog, compared manner, exploits multiple views better trade insensitivity nuisance variability specificity intrinsic variability. also introduce novel multi-view wide-baseline matching dataset, consisting mixture real synthetic objects ground truthed camera motion dense three-dimensional geometry.",4 "context-sensitive super-resolution fast fetal magnetic resonance imaging. 3d magnetic resonance imaging (mri) often trade-off fast low-resolution image acquisition highly detailed slow image acquisition. fast imaging required targets move avoid motion artefacts. particular difficult fetal mri. spatially independent upsampling techniques, state-of-the-art address problem, error prone disregard contextual information. paper propose context-sensitive upsampling method based residual convolutional neural network model learns organ specific appearance adopts semantically input data allowing generation high resolution images sharp edges fine scale detail. making contextual decisions appearance shape, present different parts image, gain maximum structural detail similar contrast provided high-resolution data. experiment $145$ fetal scans show approach yields increased psnr $1.25$ $db$ applied under-sampled fetal data \emph{cf.} baseline upsampling. furthermore, method yields increased psnr $1.73$ $db$ utilizing under-sampled fetal data perform brain volume reconstruction motion corrupted captured data.",4 "learning hierarchical information flow recurrent neural modules. propose thalnet, deep learning model inspired neocortical communication via thalamus. model consists recurrent neural modules send features routing center, endowing modules flexibility share features multiple time steps. show model learns route information hierarchically, processing input data chain modules. 
observe common architectures, feed forward neural networks skip connections, emerging special cases architecture, novel connectivity patterns learned text8 compression task. model outperforms standard recurrent neural networks several sequential benchmarks.",4 "image captioning classification dangerous situations. current robot platforms employed collaborate humans wide range domestic industrial tasks. environments require autonomous systems able classify communicate anomalous situations fires, injured persons, car accidents; generally, potentially dangerous situation humans. paper introduce anomaly detection dataset purpose robot applications well design implementation deep learning architecture classifies describes dangerous situations using single image input. report classification accuracy 97 % meteor score 16.2. make dataset publicly available paper accepted.",4 "real-coded chemical reaction optimization different perturbation functions. chemical reaction optimization (cro) powerful metaheuristic mimics interactions molecules chemical reactions search global optimum. perturbation function greatly influences performance cro solving different continuous problems. paper, study four different probability distributions, namely, gaussian distribution, cauchy distribution, exponential distribution, modified rayleigh distribution, perturbation function cro. different distributions different impacts solutions. distributions tested set well-known benchmark functions simulation results show problems different characteristics different preference distribution function. study gives guidelines design cro different types optimization problems.",4 "local neighborhood intensity pattern: new texture feature descriptor image retrieval. paper, new texture descriptor based local neighborhood intensity difference proposed content based image retrieval (cbir). 
computation texture features like local binary pattern (lbp), center pixel 3*3 window image compared remaining neighbors, one pixel time generate binary bit pattern. ignores effect adjacent neighbors particular pixel binary encoding also texture description. proposed method based concept neighbors particular pixel hold significant amount texture information considered efficient texture representation cbir. taking account, develop new texture descriptor, named local neighborhood intensity pattern (lnip) considers relative intensity difference particular pixel center pixel considering adjacent neighbors generate sign magnitude pattern. since sign magnitude patterns hold complementary information other, two patterns concatenated single feature descriptor generate concrete useful feature descriptor. proposed descriptor tested image retrieval four databases, including three texture image databases - brodatz texture image database, mit vistex database salzburg texture database one face database at&t face database. precision recall values observed databases compared state-of-art local patterns. proposed method showed significant improvement many existing methods.",4 "admm-based networked stochastic variational inference. owing recent advances ""big data"" modeling prediction tasks, variational bayesian estimation gained popularity due ability provide exact solutions approximate posteriors. one key technique approximate inference stochastic variational inference (svi). svi poses variational inference stochastic optimization problem solves iteratively using noisy gradient estimates. aims handle massive data predictive classification tasks applying complex bayesian models observed well latent variables. paper aims decentralize allowing parallel computation, secure learning robustness benefits. 
use alternating direction method multipliers top-down setting develop distributed svi algorithm independent learners running inference algorithms require sharing estimated model parameters instead private datasets. work extends distributed svi-admm algorithm first propose, admm-based networked svi algorithm learners working distributively share information according rules graph form network. kind work lies umbrella `deep learning networks' verify algorithm topic-modeling problem corpus wikipedia articles. illustrate results latent dirichlet allocation (lda) topic model large document classification, compare performance centralized algorithm, use numerical experiments corroborate analytical results.",4 "quality priority ratios estimation relation selected prioritization procedure consistency measure pairwise comparison matrix. overview current debates contemporary research devoted modeling decision making processes facilitation directs attention analytic hierarchy process (ahp). core ahp various prioritization procedures (pps) consistency measures (cms) pairwise comparison matrix (pcm) which, sense, reflects preferences decision makers. certainly, judgments preferences perfectly consistent (cardinally transitive), pps coincide quality priority ratios (prs) estimation exemplary. however, human judgments rarely consistent, thus quality prs estimation may significantly vary. scale variations depends applied pp utilized cm pcm. important find pps cms pcm lead directly improvement prs estimation accuracy. main goal research realized properly designed, coded executed seminal sophisticated simulation algorithms wolfram mathematica 8.0. research results convince embedded ahp commonly applied, genuine pp cm pcm may significantly deteriorate quality prs estimation; however, solutions proposed paper significantly improve methodology.",4 "anatomy search mining system digital archives. 
samtla (search mining tools linguistic analysis) digital humanities system designed collaboration historians linguists assist research work quantifying content textual corpora approximate phrase search document comparison. retrieval engine uses character-based n-gram language model rather conventional word-based one achieve great flexibility language agnostic query processing. index implemented space-optimised character-based suffix tree accompanying database document content metadata. number text mining tools integrated system allow researchers discover textual patterns, perform comparative analysis, find currently popular research community. herein describe system architecture, user interface, models algorithms, data storage samtla system. also present several case studies usage practice together evaluation systems' ranking performance crowdsourcing.",4 "fast learning rate deep learning via kernel perspective. develop new theoretical framework analyze generalization error deep learning, derive new fast learning rate two representative algorithms: empirical risk minimization bayesian deep learning. series theoretical analyses deep learning revealed high expressive power universal approximation capability. although analyses highly nonparametric, existing generalization error analyses developed mainly fixed dimensional parametric model. compensate gap, develop infinite dimensional model based integral form performed analysis universal approximation capability. allows us define reproducing kernel hilbert space corresponding layer. point view deal ordinary finite dimensional deep neural network finite approximation infinite dimensional one. approximation error evaluated degree freedom reproducing kernel hilbert space layer. estimate good finite dimensional model, consider empirical risk minimization bayesian deep learning. derive generalization error bound shown appears bias-variance trade-off terms number parameters finite dimensional approximation. 
show optimal width internal layers determined degree freedom convergence rate faster $o(1/\sqrt{n})$ rate shown existing studies.",12 "automatic vertebra labeling large-scale 3d ct using deep image-to-image network message passing sparsity regularization. automatic localization labeling vertebra 3d medical images plays important role many clinical tasks, including pathological diagnosis, surgical planning postoperative assessment. however, unusual conditions pathological cases, abnormal spine curvature, bright visual imaging artifacts caused metal implants, limited field view, increase difficulties accurate localization. paper, propose automatic fast algorithm localize label vertebra centroids 3d ct volumes. first, deploy deep image-to-image network (di2in) initialize vertebra locations, employing convolutional encoder-decoder architecture together multi-level feature concatenation deep supervision. next, centroid probability maps di2in iteratively evolved message passing schemes based mutual relation vertebra centroids. finally, localization results refined sparsity regularization. proposed method evaluated public dataset 302 spine ct volumes various pathologies. method outperforms state-of-the-art methods terms localization accuracy. run time around 3 seconds average per case. boost performance, retrain di2in additional 1000+ 3d ct volumes different patients. best knowledge, first time 1000 3d ct volumes expert annotation adopted experiments anatomic landmark detection tasks. experimental results show training large dataset significantly improves performance overall identification rate, first time knowledge, reaches 90 %.",4 "training restricted boltzmann machines word observations. restricted boltzmann machine (rbm) flexible tool modeling complex data, however significant computational difficulties using rbms model high-dimensional multinomial observations. 
natural language processing applications, words naturally modeled k-ary discrete distributions, k determined vocabulary size easily hundreds thousands. conventional approach training rbms word observations limited requires sampling states k-way softmax visible units block gibbs updates, operation takes time linear k. work, address issue employing general class markov chain monte carlo operators visible units, yielding updates computational complexity independent k. demonstrate success approach training rbms hundreds millions word n-grams using larger vocabularies previously feasible using learned features improve performance chunking sentiment classification tasks, achieving state-of-the-art results latter.",4 "m3: scaling machine learning via memory mapping. process data fit ram, conventional wisdom would suggest using distributed approaches. however, recent research demonstrated virtual memory's strong potential scaling graph mining algorithms single machine. propose use similar approach general machine learning. contribute: (1) latest finding memory mapping also feasible technique scaling general machine learning algorithms like logistic regression k-means, data fits exceeds ram (we tested datasets 190gb); (2) approach, called m3, enables existing machine learning algorithms work out-of-core datasets memory mapping, achieving speed significantly faster 4-instance spark cluster, comparable 8-instance cluster.",4 "information assisted dictionary learning fmri data analysis. extracting information functional magnetic resonance images (fmri) major area research many years, still demanding accurate techniques. nowadays, plenty available information brain-behavior used develop precise methods. thus, paper presents new dictionary learning method allows incorporating external information regarding studied problem, novel sets constraints. 
finally, apply proposed method synthetic fmri data, several tests show improvement performance compared common techniques.",19 "use skeletons learning bayesian networks. paper, present heuristic operator aims simultaneously optimizing orientations edges intermediate bayesian network structure search process. done alternating space directed acyclic graphs (dags) space skeletons. found orientations edges based scoring function rather induced conditional independences. operator used extension commonly employed search strategies. evaluated experiments artificial real-world data.",4 "additive non-negative matrix factorization missing data. non-negative matrix factorization (nmf) previously shown useful decomposition multivariate data. interpret factorization new way use generate missing attributes test data. provide joint optimization scheme missing attributes well nmf factors. prove monotonic convergence algorithms. present classification results cases missing attributes.",4 "visual decoding targets visual search human eye fixations. human gaze reveal users' intents extend intents inferred even visualized? gaze proposed implicit source information predict target visual search and, recently, predict object class attributes search target. work, go one step investigate feasibility combining recent advances encoding human gaze information using deep convolutional neural networks power generative image models visually decode, i.e. create visual representation of, search target. visual decoding challenging two reasons: 1) search target resides user's mind subjective visual pattern, often even described verbally person, 2) is, yet, unclear gaze fixations contain sufficient information task all. show, first time, visual representations search targets indeed decoded human gaze fixations. propose first encode fixations semantic representation decode representation image.
evaluate method recent gaze dataset 14 participants searching clothing image collages validate model's predictions using two human studies. results show 62% (chance level = 10%) time users able select categories decoded image right. second studies show importance local gaze encoding decoding visual search targets user",4 "genetic algorithm approach solving flexible job shop scheduling problem. flexible job shop scheduling noticed effective manufacturing system cope rapid development today's competitive environment. flexible job shop scheduling problem (fjssp) known np-hard problem field optimization. considering dynamic state real world makes problem complicated. studies field fjssp focused minimizing total makespan. paper, mathematical model fjssp developed. objective function maximizing total profit meeting constraints. time-varying raw material costs selling prices dissimilar demands period, considered decrease gaps reality model. manufacturer produces various parts gas valves used case study. scheduling problem multi-part, multi-period, multi-operation parallel machines solved using genetic algorithm (ga). best obtained answer determines economic amount production different machines belong predefined operations part satisfy customer demand period.",12 "monte carlo tree search sampled information relaxation dual bounds. monte carlo tree search (mcts), famously used game-play artificial intelligence (e.g., game go), well-known strategy constructing approximate solutions sequential decision problems. primary innovation use heuristic, known default policy, obtain monte carlo estimates downstream values states decision tree. information used iteratively expand tree towards regions states actions optimal policy might visit. however, guarantee convergence optimal action, mcts requires entire tree expanded asymptotically. 
paper, propose new technique called primal-dual mcts utilizes sampled information relaxation upper bounds potential actions, creating possibility ""ignoring"" parts tree stem highly suboptimal choices. allows us prove despite converging partial decision tree limit, recommended action primal-dual mcts optimal. new approach shows significant promise used optimize behavior single driver navigating graph operating ride-sharing platform. numerical experiments real dataset 7,000 trips new jersey suggest primal-dual mcts improves upon standard mcts producing deeper decision trees exhibits reduced sensitivity size action space.",12 "surprise: you've got explaining. events surprising others? propose events difficult explain surprising. two experiments reported test impact different event outcomes (outcome-type) task demands (task) ratings surprise simple story scenarios. outcome-type variable, participants saw outcomes either known less-known surprising outcomes scenario. task variable, participants either answered comprehension questions provided explanation outcome. outcome-type reliably affected surprise judgments; known outcomes rated less surprising less-known outcomes. task also reliably affected surprise judgments; people provided explanation lowered surprise judgments relative simply answering comprehension questions. experiments thus provide evidence less-explored explanation aspect surprise, specifically showing ease explanation key factor determining level surprise experienced.",4 "analysis parallelized motion masking using dual-mode single gaussian models. motion detection video important number applications fields. video surveillance, motion detection essential accompaniment activity recognition early warning systems. robotics also much gain motion detection segmentation, particularly high speed motion tracking tactile systems. myriad techniques detecting masking motion image.
successful systems used gaussian models discern background foreground image (motion static imagery). however, particularly case moving camera frame reference, necessary compensate motion camera attempting discern objects moving foreground. example, possible estimate motion camera optical flow methods temporal differencing compensate motion background subtraction model. selection method yi et al. using dual-mode single gaussian models this. implement technique intel's thread building blocks (tbb) nvidia's cuda libraries. compare parallelization improvements theoretical analysis speedups based characteristics selected model attributes tbb cuda. make implementation available public.",4 "museum exhibit identification challenge domain adaptation beyond. paper, approach open problem artwork identification propose new dataset dubbed open museum identification challenge (open mic). contains photos exhibits captured 10 distinct exhibition spaces several museums showcase paintings, timepieces, sculptures, glassware, relics, science exhibits, natural history pieces, ceramics, pottery, tools indigenous crafts. goal open mic stimulate research domain adaptation, egocentric recognition few-shot learning providing testbed complementary famous office dataset reaches 90% accuracy. form dataset, captured number images per art piece mobile phone wearable cameras form source target data splits, respectively. achieve robust baselines, build recent approach aligns per-class scatter matrices source target cnn streams [15]. moreover, exploit positive definite nature representations using end-to-end bregman divergences riemannian metric. present baselines training/evaluation per exhibition training/evaluation combined set covering 866 exhibit identities. exhibition poses distinct challenges e.g., quality lighting, motion blur, occlusions, clutter, viewpoint scale variations, rotations, glares, transparency, non-planarity, clipping, break results w.r.t. 
factors.",4 "characteristic kernels and infinitely divisible distributions. we connect shift-invariant characteristic kernels to infinitely divisible distributions on $\mathbb{r}^{d}$. characteristic kernels play an important role in machine learning applications, where kernel means are used to distinguish between two probability measures. the contribution of this paper is two-fold. first, we show, using the l\'evy-khintchine formula, that any shift-invariant kernel given by a bounded, continuous and symmetric probability density function (pdf) of an infinitely divisible distribution on $\mathbb{r}^d$ is characteristic. we also present a closure property of characteristic kernels under addition, pointwise product and convolution. second, in developing various kernel mean algorithms, it is fundamental to compute the following values: (i) kernel mean values $m_p(x)$, $x \in \mathcal{x}$, and (ii) kernel mean rkhs inner products ${\left\langle m_p, m_q \right\rangle_{\mathcal{h}}}$, for probability measures $p, q$. if $p, q$, and the kernel $k$ are gaussians, then the computation of (i) and (ii) results in gaussian pdfs and is tractable. we generalize this gaussian combination to more general cases in the class of infinitely divisible distributions. we then introduce a {\it conjugate} kernel and a {\it convolution trick}, so that (i) and (ii) have the same pdf form, expecting tractable computation at least in some cases. as specific instances, we explore $\alpha$-stable distributions and a rich class of generalized hyperbolic distributions, in which the laplace, cauchy and student-t distributions are included.",19 "adapting deep visuomotor representations with weak pairwise constraints. real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. for many robotic control tasks, real world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. we propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large easy-to-obtain source dataset (e.g. synthetic images) to a target real-world domain, without requiring expensive manual data annotation of real world data before policy search.
supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation. however, they require the manual alignment of such image pairs. fully unsupervised adaptation methods instead rely on minimizing the discrepancy between the feature distributions across domains. we propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement of expensive annotation by using weakly aligned pairs of images from the source and target domains. focusing on adapting from simulation to real world data using a pr2 robot, we evaluate our approach on a manipulation task and show that by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world.",4 "model selection with nonlinear embedding for unsupervised domain adaptation. domain adaptation deals with adapting classifiers trained on data from a source distribution, to work effectively on data from a target distribution. in this paper, we introduce the nonlinear embedding transform (net) for unsupervised domain adaptation. the net reduces cross-domain disparity through nonlinear domain alignment. it also embeds the domain-aligned data such that similar data points are clustered together. this results in enhanced classification. to determine the parameters in the net model (and in other unsupervised domain adaptation models), we introduce a validation procedure by sampling source data points that are similar in distribution to the target data. we test net and the validation procedure using popular image datasets and compare the classification results across competitive procedures for unsupervised domain adaptation.",4 "improvement/extension of modular systems as combinatorial reengineering (survey). the paper describes development (improvement/extension) approaches for composite (modular) systems (as combinatorial reengineering).
the following system improvement/extension actions are considered: (a) improvement of system component(s) (e.g., improvement of a system component, replacement of a system component); (b) improvement of a system component interconnection (compatibility); (c) joint improvement of system component(s) and their interconnection; (d) improvement of the system structure (replacement of system part(s), addition of a system part, deletion of a system part, modification of the system structure). the study of system improvement approaches involves several crucial issues: (i) scales for the evaluation of system components and component compatibility (quantitative scale, ordinal scale, poset-like scale, scale based on interval multiset estimate), (ii) evaluation of integrated system quality, (iii) integration methods to obtain the integrated system quality. system improvement/extension strategies are examined as selection/combination of improvement action(s) and modification of the system structure. the strategies are based on combinatorial optimization problems (e.g., multicriteria selection, knapsack problem, multiple choice problem, combinatorial synthesis based on morphological clique problem, assignment/reassignment problem, graph recoloring problem, spanning problems, hotlink assignment). here, heuristics are used. various system improvement/extension strategies are presented, including illustrative numerical examples.",4 "learning efficient image representation for person re-identification. color names based image representation is successfully used in person re-identification, due to the advantages of being compact, intuitively understandable as well as robust to photometric variance. however, there exists diversity between the underlying distribution of color names' rgb values and that of image pixels' rgb values, which may lead to inaccuracy when directly comparing them in euclidean space. in this paper, we propose a new method named soft gaussian mapping (sgm) to address this problem. we model the discrepancies between color names and pixels using a gaussian and utilize the inverse of the covariance matrix to bridge the gap between them. based on sgm, an image can be converted to several soft gaussian maps.
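the soft gaussian mapping idea just described - modeling each color name as a gaussian over rgb and bridging the gap with the inverse covariance matrix - can be sketched as follows (toy means and covariances; all names are hypothetical and not the authors' code):

```python
import numpy as np

def soft_gaussian_maps(pixels, means, covs):
    """Soft-assign each RGB pixel to a set of colour-name Gaussians.

    Each colour name is a Gaussian over RGB; the inverse covariance
    (a Mahalanobis distance) bridges the gap between the colour name's
    canonical RGB value and the observed pixel values.
    """
    n, k = pixels.shape[0], means.shape[0]
    logp = np.empty((n, k))
    for j in range(k):
        diff = pixels - means[j]
        inv = np.linalg.inv(covs[j])
        maha = np.einsum('ni,ij,nj->n', diff, inv, diff)
        logp[:, j] = -0.5 * (maha + np.log(np.linalg.det(covs[j])))
    logp -= logp.max(axis=1, keepdims=True)       # stabilise the softmax
    w = np.exp(logp)
    return w / w.sum(axis=1, keepdims=True)       # one soft map per colour name

# Two toy colour names: "red" and "blue".
means = np.array([[200., 30., 30.], [30., 30., 200.]])
covs = np.stack([np.eye(3) * 400.0, np.eye(3) * 400.0])
pixels = np.array([[190., 40., 35.], [25., 20., 210.]])
maps = soft_gaussian_maps(pixels, means, covs)
print(maps.round(3))
```

each column of `maps`, reshaped to the image grid, plays the role of one soft gaussian map over which local max pooling can then be applied.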
in each soft gaussian map, we seek to establish stable and robust descriptors within a local region via a max pooling operation. then, a robust image representation based on color names is obtained by concatenating the statistical descriptors of each stripe. when labeled data are available, a discriminative subspace projection matrix is learned to build efficient representations of an image via cross-view coupling learning. experiments on the public datasets - viper, prid450s and cuhk03 - demonstrate the effectiveness of our method.",4 "benchmarking decoupled neural interfaces with synthetic gradients. artificial neural networks are a particular class of learning systems modeled after biological neural functions with an interesting penchant for hebbian learning, where ""neurons that fire together, wire together"". however, unlike their natural counterparts, artificial neural networks have a close and stringent coupling between the modules of neurons in the network. this coupling or locking imposes upon the network a strict and inflexible structure that prevents layers in the network from updating their weights until a full feed-forward and backward pass has occurred. such a constraint, though it may have sufficed for a while, is no longer feasible in the era of very-large-scale machine learning, coupled with the increased desire for parallelization of the learning process across multiple computing infrastructures. to solve this problem, synthetic gradients (sg) with decoupled neural interfaces (dni) were introduced as a viable alternative to the backpropagation algorithm. this paper performs a speed benchmark to compare the speed and accuracy capabilities of sg-dni as opposed to a standard neural interface using a multilayer perceptron (mlp). sg-dni shows good promise, as it not only captures the learning problem, but is also 3-fold faster due to its asynchronous learning capabilities.",4 "superpixel based segmentation and classification of polyps in wireless capsule endoscopy. wireless capsule endoscopy (wce) is a relatively new technology to record the entire gi tract, in vivo. the large amounts of frames captured during an examination cause difficulties for physicians to review all the frames. the need for reducing the reviewing time using intelligent methods is a challenge. polyps are considered growing tissues on the surface of the intestinal tract inside an organ.
while polyps are not always cancerous, once one becomes larger than a centimeter, it has a great chance of turning into cancer. wce frames provide the possibility of early stage detection of polyps. here, the application of simple linear iterative clustering (slic) superpixels for segmentation of polyps in wce frames is evaluated. different slic superpixel numbers are examined to find the highest sensitivity for detection of polyps. the slic superpixel segmentation is promising for improving on the results of previous studies. finally, the superpixels were classified using a support vector machine (svm) by extracting texture and color features. the classification results showed a sensitivity of 91%.",4 "semi-supervised cross-entropy clustering with information bottleneck constraint. in this paper, we propose a semi-supervised clustering method, cec-ib, that models a data set with gaussian distributions and retrieves clusters based on a partial labeling provided by the user (partition-level side information). by combining the ideas of cross-entropy clustering (cec) with those of the information bottleneck method (ib), our method trades between three conflicting goals: the accuracy with which the data set is modeled, the simplicity of the model, and the consistency of the clustering with the side information. experiments demonstrate that cec-ib has a performance comparable to gaussian mixture models (gmm) in a classical semi-supervised scenario, but is faster, more robust to noisy labels, automatically determines the optimal number of clusters, and performs well when not all classes are present in the side information. moreover, in contrast to other semi-supervised models, it can be successfully applied in discovering natural subgroups if the partition-level side information is derived from the top levels of a hierarchical clustering.",4 "histogram of oriented principal components for cross-view action recognition. existing techniques for 3d action recognition are sensitive to viewpoint variations because they extract features from depth images, which are viewpoint dependent. in contrast, we directly process pointclouds for cross-view action recognition from unknown and unseen views. we propose the histogram of oriented principal components (hopc) descriptor that is robust to noise, viewpoint, scale and action speed variations.
at each 3d point, hopc is computed by projecting the three scaled eigenvectors of the pointcloud within a local spatio-temporal support volume onto the vertices of a regular dodecahedron. hopc is also used for the detection of spatio-temporal keypoints (stk) in 3d pointcloud sequences, so that view-invariant stk descriptors (or local hopc descriptors) at these key locations can be used for action recognition. we also propose a global descriptor computed from the normalized spatio-temporal distribution of stks in 4-d, which we refer to as stk-d. we have evaluated the performance of our proposed descriptors against nine existing techniques on two cross-view and three single-view human action recognition datasets. the experimental results show that our techniques provide significant improvement over state-of-the-art methods.",4 "ruber: an unsupervised method for automatic evaluation of open-domain dialog systems. open-domain human-computer conversation has been attracting increasing attention over the past few years. however, there does not exist a standard automatic evaluation metric for open-domain dialog systems; researchers usually resort to human annotation for model evaluation, which is time- and labor-intensive. in this paper, we propose ruber, a referenced metric and unreferenced metric blended evaluation routine, which evaluates a reply by taking into consideration both a groundtruth reply and a query (the previous user-issued utterance). our metric is learnable, but its training does not require labels of human satisfaction. hence, ruber is flexible and extensible to different datasets and languages. experiments on both retrieval and generative dialog systems show that ruber has a high correlation with human annotation.",4 "solving factored mdps with hybrid state and action variables. efficient representations and solutions for large decision problems with continuous and discrete variables are among the most important challenges faced by the designers of automated decision support systems. in this paper, we describe a novel hybrid factored markov decision process (mdp) model that allows for a compact representation of these problems, and a new hybrid approximate linear programming (halp) framework that permits their efficient solutions. the central idea of halp is to approximate the optimal value function by a linear combination of basis functions and optimize its weights by linear programming.
we analyze both theoretical and computational aspects of this approach, and demonstrate its scale-up potential on several hybrid optimization problems.",4 "global analysis of expectation maximization for mixtures of two gaussians. expectation maximization (em) is among the most popular algorithms for estimating parameters of statistical models. however, em, which is an iterative algorithm based on the maximum likelihood principle, is generally only guaranteed to find stationary points of the likelihood objective, and these points may be far from any maximizer. this article addresses this disconnect between the statistical principles behind em and its algorithmic properties. specifically, it provides a global analysis of em for specific models in which the observations comprise an i.i.d. sample from a mixture of two gaussians. this is achieved by (i) studying the sequence of parameters from idealized executions of em in the infinite sample limit, and fully characterizing the limit points of the sequence in terms of the initial parameters; and by (ii) based on this convergence analysis, establishing statistical consistency (or lack thereof) for the actual sequence of parameters produced by em.",12 "least angle and $\ell_1$ penalized regression: a review. least angle regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. it provides an explanation for the similar behavior of lasso ($\ell_1$-penalized regression) and forward stagewise regression, and provides a fast implementation of both. the idea has caught on rapidly, and has sparked a great deal of research interest. in this paper, we give an overview of least angle regression and the current state of related research.",19 "model selection for topic models via spectral decomposition. topic models have achieved significant successes in analyzing large-scale text corpora. in practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. following recent advances in topic model inference via tensor decomposition, we make a first attempt to provide theoretical analysis of model selection in latent dirichlet allocation. under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. experimental results demonstrate that our bounds are accurate and tight.
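the em iteration analyzed in the two-gaussians abstract above can be sketched for a balanced, unit-variance mixture where only the two means are updated (a minimal illustration; sample sizes and initialization are made up):

```python
import numpy as np

def em_two_gaussians(x, mu0, mu1, iters=100, sigma=1.0):
    """EM for a balanced mixture of two unit-variance Gaussians.

    Only the two means are estimated, matching the idealized setting in
    which EM's limit points can be characterized from the initial means.
    """
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point
        d0 = (x - mu0) ** 2
        d1 = (x - mu1) ** 2
        r1 = 1.0 / (1.0 + np.exp((d1 - d0) / (2 * sigma ** 2)))
        # M-step: responsibility-weighted means
        mu0 = np.sum((1 - r1) * x) / np.sum(1 - r1)
        mu1 = np.sum(r1 * x) / np.sum(r1)
    return mu0, mu1

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3, 1, 2000), rng.normal(3, 1, 2000)])
mu0, mu1 = em_two_gaussians(x, -0.5, 0.5)
print(round(mu0, 1), round(mu1, 1))
```

with a symmetric initialization the estimated means converge near the true component means; the cited analysis characterizes exactly which initializations lead to such good fixed points in the infinite sample limit.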
furthermore, using the gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis in other latent models.",19 "synkhronos: a multi-gpu theano extension for data parallelism. we present synkhronos, an extension to theano for multi-gpu computations leveraging data parallelism. our framework provides automated execution and synchronization across devices, allowing users to continue to write serial programs without risk of race conditions. the nvidia collective communication library is used for high-bandwidth inter-gpu communication. further enhancements to the theano function interface include input slicing (with aggregation) and input indexing, which perform common data-parallel computation patterns efficiently. one example use case is synchronous sgd, which has recently been shown to scale well for a growing set of deep learning problems. when training resnet-50, we achieve a near-linear speedup of 7.5x on an nvidia dgx-1 using 8 gpus, relative to theano-only code running on a single gpu in isolation. yet synkhronos remains general for data-parallel computation programmable in theano. by implementing parallelism at the level of individual theano functions, our framework uniquely addresses a niche between manual multi-device programming and prescribed multi-gpu training routines.",4 "deepskeleton: learning multi-task scale-associated deep side outputs for object skeleton extraction in natural images. object skeletons are useful for object representation and object detection. they are complementary to the object contour, and provide extra information, such as how object scale (thickness) varies among object parts. but object skeleton extraction from natural images is very challenging, because it requires the extractor to be able to capture both local and non-local image context in order to determine the scale of each skeleton pixel. in this paper, we present a novel fully convolutional network with multiple scale-associated side outputs to address this problem. by observing the relationship between the receptive field sizes of the different layers in the network and the skeleton scales they can capture, we introduce two scale-associated side outputs at each stage of the network.
the network is trained by multi-task learning, where one task is skeleton localization, to classify whether a pixel is a skeleton pixel or not, and the other is skeleton scale prediction, to regress the scale of each skeleton pixel. supervision is imposed at different stages by guiding the scale-associated side outputs toward the groundtruth skeletons at the appropriate scales. the responses of the multiple scale-associated side outputs are then fused in a scale-specific way to detect skeleton pixels using multiple scales effectively. our method achieves promising results on two skeleton extraction datasets, and significantly outperforms its competitors. additionally, the usefulness of the obtained skeletons and scales (thickness) is verified on two object detection applications: foreground object segmentation and object proposal detection.",4 "particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost. poyiadjis et al. (2011) show how particle methods can be used to estimate both the score and the observed information matrix for state space models. these methods either suffer from a computational cost that is quadratic in the number of particles, or produce estimates whose variance increases quadratically with the amount of data. this paper introduces an alternative approach for estimating these terms at a computational cost that is linear in the number of particles. the method is derived using a combination of kernel density estimation, to avoid the particle degeneracy that causes the quadratically increasing variance, and rao-blackwellisation. crucially, we show the method is robust to the choice of bandwidth within the kernel density estimation, as it has good asymptotic properties regardless of this choice. our estimates of the score and observed information matrix can be used within both online and batch procedures for estimating parameters of state space models. empirical results show improved parameter estimates compared to existing methods at a significantly reduced computational cost. supplementary materials including code are available.",19 "postprocessing of compressed images via sequential denoising. in this work we propose a novel postprocessing technique for compression-artifact reduction.
our approach is based on posing this task as an inverse problem, with a regularization that leverages existing state-of-the-art image denoising algorithms. we rely on the recently proposed plug-and-play prior framework, which suggests the solution of general inverse problems via the alternating direction method of multipliers (admm), leading to a sequence of gaussian denoising steps. a key feature in our scheme is a linearization of the compression-decompression process, so as to get a formulation that can be optimized. in addition, we supply a thorough analysis of this linear approximation for several basic compression procedures. the proposed method is suitable for diverse compression techniques that rely on transform coding. specifically, we demonstrate impressive gains in image quality for several leading compression methods - jpeg, jpeg2000, and hevc.",4 "simultaneous traffic sign detection and boundary estimation using a convolutional neural network. we propose a novel traffic sign detection system that simultaneously estimates the location and precise boundary of traffic signs using a convolutional neural network (cnn). estimating the precise boundary of traffic signs is important in navigation systems for intelligent vehicles, where traffic signs can be used as 3d landmarks of the road environment. previous traffic sign detection systems, including recent methods based on cnns, only provide bounding boxes of traffic signs as output, and thus require additional processes such as contour estimation or image segmentation to obtain the precise sign boundary. in this work, the boundary estimation of traffic signs is formulated as a 2d pose and shape class prediction problem, and this is effectively solved by a single cnn. with the predicted 2d pose and shape class of a target traffic sign in the input image, we estimate the actual boundary of the target sign by projecting the boundary of a corresponding template sign image into the input image plane. by formulating the boundary estimation problem as a cnn-based pose and shape prediction task, our method is end-to-end trainable, and more robust to occlusion and small targets than other boundary estimation methods that rely on contour estimation or image segmentation.
the proposed method with architectural optimization provides accurate traffic sign boundary estimation that is also efficient to compute, showing a detection frame rate higher than 7 frames per second on low-power mobile platforms.",4 "novel feature extraction, selection and fusion for effective malware family classification. modern malware is designed with mutation characteristics, namely polymorphism and metamorphism, which cause an enormous growth in the number of variants of malware samples. categorization of malware samples on the basis of their behaviors is essential for the computer security community, because its members receive huge numbers of malware samples every day, and the signature extraction process is usually based on malicious parts characterizing malware families. microsoft released a malware classification challenge in 2015 with a huge dataset of near 0.5 terabytes of data, containing 20k malware samples. the analysis of this dataset inspired the development of a novel paradigm that is effective in categorizing malware variants into their actual family groups. this paradigm is presented and discussed in the present paper, where emphasis has been given to the phases related to the extraction and selection of a set of novel features for the effective representation of malware samples. features are grouped according to different characteristics of malware behavior, and their fusion is performed according to a per-class weighting paradigm. the proposed method achieved a very high accuracy ($\approx$ 0.998) on the microsoft malware challenge dataset.",4 "stochastic multi-armed bandits in constant space. we consider the stochastic bandit problem in the sublinear space setting, where one cannot record the win-loss record for all $k$ arms. we give an algorithm using $o(1)$ words of space with regret \[ \sum_{i=1}^{k}\frac{1}{\delta_i}\log \frac{\delta_i}{\delta}\log \] where $\delta_i$ is the gap between the best arm and arm $i$, and $\delta$ is the gap between the best and second-best arms. if the rewards are bounded away from $0$ and $1$, this is within an $o(\log 1/\delta)$ factor of the optimum regret possible without space constraints.",4 "demystifying neural style transfer. neural style transfer has recently demonstrated very exciting results that catch eyes in both academia and industry. despite the amazing results, the principle of neural style transfer, especially why the gram matrices can represent style, remains unclear.
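the gram-matrix/mmd equivalence claimed in the style transfer abstract can be checked numerically: the squared frobenius distance between gram matrices equals $n^2$ times the biased squared mmd under the second-order polynomial kernel $k(x,y)=(x^\top y)^2$ (a toy verification with random features, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(2)
C, N = 4, 6                      # channels, spatial positions
Fs = rng.normal(size=(C, N))     # "style" feature map, one column per position
Fg = rng.normal(size=(C, N))     # "generated" feature map

# Gram-matrix style loss, as in neural style transfer
Gs, Gg = Fs @ Fs.T, Fg @ Fg.T
gram_loss = np.sum((Gs - Gg) ** 2)

# Biased MMD^2 with the second-order polynomial kernel k(x, y) = (x.y)^2
def k(A, B):
    return (A.T @ B) ** 2

mmd2 = (k(Fs, Fs).sum() - 2 * k(Fs, Fg).sum() + k(Fg, Fg).sum()) / N ** 2

print(np.isclose(gram_loss, N ** 2 * mmd2))
```

the identity follows from tr((Gs - Gg)^2) expanding into sums of squared column inner products, which is exactly the kernel sum appearing in the biased mmd estimator.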
in this paper, we propose a novel interpretation of neural style transfer by treating it as a domain adaptation problem. specifically, we theoretically show that matching the gram matrices of feature maps is equivalent to minimizing the maximum mean discrepancy (mmd) with the second order polynomial kernel. thus, we argue that the essence of neural style transfer is to match the feature distributions between the style images and the generated images. to support our standpoint, we experiment with several other distribution alignment methods, and achieve appealing results. we believe this novel interpretation connects these two important research fields, and could enlighten future research.",4 "neural machine translation via binary code prediction. in this paper, we propose a new method for calculating the output layer in neural machine translation systems. the method is based on predicting a binary code for each word and can reduce computation time/memory requirements of the output layer to be logarithmic in vocabulary size in the best case. in addition, we also introduce two advanced approaches to improve the robustness of the proposed model: using error-correcting codes and combining softmax and binary codes. experiments on two english-japanese bidirectional translation tasks show that the proposed models achieve bleu scores that approach the softmax, while reducing memory usage to the order of less than 1/10 and improving decoding speed on cpus by x5 to x10.",4 "mining for causal relationships: a data-driven study of the islamic state. the islamic state of iraq and al-sham (isis) is a dominant insurgent group operating in iraq and syria that rose to prominence when it took over mosul in june, 2014. in this paper, we present a data-driven approach to analyzing this group using a dataset consisting of 2200 incidents of military activity surrounding isis and the forces that oppose it (including iraqi, syrian, and american-led coalition forces). we combine ideas from logic programming and causal reasoning to mine for association rules for which we present evidence of causality. we present relationships that link isis vehicle-borne improvised explosive device (vbied) activity in syria with military operations in iraq, coalition air strikes, and isis ied activity, as well as rules that may serve as indicators of spikes in indirect fire, suicide attacks, and arrests.",4 "infinite-horizon policy-gradient estimation.
gradient-based approaches to direct policy search in reinforcement learning have received much recent attention as a means to solve problems of partial observability and to avoid some of the problems associated with policy degradation in value-function methods. in this paper we introduce gpomdp, a simulation-based algorithm for generating a biased estimate of the gradient of the average reward in partially observable markov decision processes (pomdps) controlled by parameterized stochastic policies. a similar algorithm was proposed by (kimura et al. 1995). the algorithm's chief advantages are that it requires storage of only twice the number of policy parameters, uses one free parameter beta (which has a natural interpretation in terms of a bias-variance trade-off), and requires no knowledge of the underlying state. we prove convergence of gpomdp, and show how the correct choice of the parameter beta is related to the mixing time of the controlled pomdp. we briefly describe extensions of gpomdp to controlled markov chains, continuous state, observation and control spaces, multiple agents, higher-order derivatives, and a version for training stochastic policies with internal states. in a companion paper (baxter et al., this volume) we show how the gradient estimates generated by gpomdp can be used in both a traditional stochastic gradient algorithm and a conjugate-gradient procedure to find local optima of the average reward.",4 "a tensor approach to learning mixed membership community models. community detection is the task of detecting hidden communities from observed interactions. guaranteed community detection has so far been mostly limited to models with non-overlapping communities such as the stochastic block model. in this paper, we remove this restriction, and provide guaranteed community detection for a family of probabilistic network models with overlapping communities, termed the mixed membership dirichlet model, first introduced by airoldi et al. this model allows for nodes to have fractional memberships in multiple communities and assumes that the community memberships are drawn from a dirichlet distribution. moreover, it contains the stochastic block model as a special case. we propose a unified approach to learning these models via a tensor spectral decomposition method. our estimator is based on a low-order moment tensor of the observed network, consisting of 3-star counts.
our learning method is fast and is based on simple linear algebraic operations, e.g. singular value decomposition and tensor power iterations. we provide guaranteed recovery of community memberships and model parameters and present a careful finite sample analysis of our learning method. in the important special case of the (homogeneous) stochastic block model, our results match the best known scaling requirements.",4 "machine learning of phonologically conditioned noun declensions for tamil morphological generators. this paper presents machine learning solutions to a practical problem of natural language generation (nlg), particularly word formation in agglutinative languages like tamil, in a supervised manner. the morphological generator is an important component of natural language processing in artificial intelligence. it generates word forms given a root and affixes. morphophonemic changes like addition, deletion, alternation etc., occur when two or more morphemes or words are joined together. these sandhi rules must be explicitly specified in rule based morphological analyzers and generators. in a machine learning framework, the rules can be learned automatically by the system from training samples and subsequently applied to new inputs. in this paper we propose machine learning models that learn the morphophonemic rules for noun declensions from the given training data. the models were trained to learn the sandhi rules using various learning algorithms and the performance of those algorithms is presented. we conclude that machine learning of morphological processing such as word form generation can be successfully carried out in a supervised manner, without an explicit description of rules. the performance of decision trees and bayesian machine learning algorithms on noun declensions is discussed.",4 "probabilistic label relation graphs with ising models. we consider classification problems in which the label space has structure. a common example is hierarchical label spaces, corresponding to the case where one label subsumes another (e.g., animal subsumes dog). but labels can also be mutually exclusive (e.g., dog vs cat) or unrelated (e.g., furry, carnivore). to jointly model hierarchy and exclusion relations, the notion of a hex (hierarchy and exclusion) graph was introduced in [7].
when combined with a conditional random field (crf) over a deep neural network (dnn), this gave state of the art results when applied to visual object classification problems where the training labels were drawn from different levels of the imagenet hierarchy (e.g., an image might be labeled with the basic level category ""dog"", rather than the more specific label ""husky""). in this paper, we extend the hex model to allow for soft or probabilistic relations between labels, which is useful when there is uncertainty about the relationship between two labels (e.g., an antelope is ""sort of"" furry, but not to the same degree as a grizzly bear). we call our new model phex, for probabilistic hex. we show that the phex graph can be converted to an ising model, which allows us to use existing off-the-shelf inference methods (in contrast to the hex method, which needed specialized inference algorithms). experimental results show significant improvements in a number of large-scale visual object classification tasks, outperforming the previous hex model.",4 "behavioral learning of aircraft landing sequencing using a society of probabilistic finite state machines. air traffic control (atc) is a complex safety critical environment. a tower controller would be making many decisions in real-time to sequence aircraft. while optimization tools exist to help the controller in some airports, even in these situations, the real sequence of aircraft adopted by the controller is significantly different from the one proposed by the optimization algorithm. this is due to the dynamic nature of the environment. the objective of this paper is to test the hypothesis that one can learn from the sequences adopted by the controller strategies that can act as heuristics in decision support tools for aircraft sequencing. this aim is tested in this paper by attempting to learn sequences generated from a well-known sequencing method that is used in the real world. the approach relies on a genetic algorithm (ga) to learn these sequences using a society of probabilistic finite-state machines (pfsms). each pfsm learns a different sub-space, thus decomposing the learning problem into a group of agents that need to work together to learn the overall problem. three sequence metrics (levenshtein, hamming and position distances) are compared as fitness functions for the ga. the results suggest that it is possible to learn the behavior of the algorithm/heuristic that generated the original sequence from very limited information.",4 "high-dimensional dynamics of generalization error in neural networks.
we perform an average case analysis of the generalization dynamics of large neural networks trained using gradient descent. we study the practically-relevant ""high-dimensional"" regime where the number of free parameters in the network is on the order of, or even larger than, the number of examples in the dataset. using random matrix theory and exact solutions in linear models, we derive the generalization error and training error dynamics of learning, and analyze how they depend on the dimensionality of the data and the signal to noise ratio of the learning problem. we find that the dynamics of gradient descent learning naturally protect against overtraining and overfitting in large networks. overtraining is worst at intermediate network sizes, when the effective number of free parameters equals the number of samples, and is thus reduced by making a network smaller or larger. additionally, in the high-dimensional regime, low generalization error requires starting with small initial weights. we then turn to non-linear neural networks, and show that making networks very large does not harm their generalization performance. on the contrary, it can in fact reduce overtraining, even without early stopping or regularization of any sort. we identify two novel phenomena underlying this behavior in overcomplete models: first, there is a frozen subspace of the weights in which no learning occurs under gradient descent; and second, the statistical properties of the high-dimensional regime yield better-conditioned input correlations, which protect against overtraining. we demonstrate that naive application of worst-case theories such as rademacher complexity is inaccurate in predicting the generalization performance of deep neural networks, and derive an alternative bound that incorporates the frozen subspace and conditioning effects and qualitatively matches the behavior observed in simulation.",19 information extraction from broadcast news. this paper discusses the development of trainable statistical models for extracting content from television and radio news broadcasts. in particular we concentrate on statistical finite state models for identifying proper names and other named entities in broadcast speech. two models are presented: the first represents name class information as a word attribute; the second represents both word-word and class-class transitions explicitly. a common n-gram based formulation is used for both models.
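the frozen-subspace phenomenon described in the generalization-dynamics abstract can be seen directly in a linear least-squares model: gradient descent never moves the weights along directions in the null space of the data (a minimal numpy illustration with made-up dimensions, not the authors' experiments):

```python
import numpy as np

# Gradient descent on least squares: components of w lying in the null
# space of X receive no gradient and stay frozen at their initialization.
rng = np.random.default_rng(3)
n, d = 5, 8                       # fewer examples than parameters
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
w = rng.normal(size=d)
w_init = w.copy()

for _ in range(2000):
    grad = X.T @ (X @ w - y) / n  # always lies in the row space of X
    w -= 0.05 * grad

# Project the weight change onto the null space of X: it should be ~0,
# i.e. learning only happened inside the row space of X.
_, _, Vt = np.linalg.svd(X)
null_basis = Vt[n:]               # the (d - n) directions X cannot see
frozen_change = null_basis @ (w - w_init)
print(np.allclose(frozen_change, 0, atol=1e-8))
```

this is why small initial weights help in the overcomplete regime: whatever mass starts in the frozen subspace stays there and contributes to the final error.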
the task of named entity identification is characterized by relatively sparse training data, and issues related to smoothing are discussed. experiments are reported using the darpa/nist hub-4e evaluation for north american broadcast news.,4 "horizontally scalable submodular maximization. a variety of large-scale machine learning problems can be cast as instances of constrained submodular maximization. existing approaches for distributed submodular maximization have a critical drawback: the capacity - the number of instances that can fit in memory - must grow with the data set size. in practice, while one can provision many machines, the capacity of each machine is limited by physical constraints. we propose a truly scalable approach for distributed submodular maximization under fixed capacity. the proposed framework applies to a broad class of algorithms and constraints, and provides theoretical guarantees on the approximation factor for any available capacity. we empirically evaluate the proposed algorithm on a variety of data sets and demonstrate that it achieves performance competitive with the centralized greedy solution.",19 "applying an ensemble learning method for improving multi-label classification performance. in recent years, the multi-label classification problem has become an important research issue. in this kind of classification, each sample is associated with a set of class labels. ensemble approaches are supervised learning algorithms in which an operator takes a number of learning algorithms, namely base-level algorithms, and combines their outcomes to make an estimation. the simplest form of ensemble learning is to train the base-level algorithms on random subsets of data and then let them vote for the most popular classifications, or average the predictions of the base-level algorithms. in this study, an ensemble learning method is proposed for improving multi-label classification evaluation criteria. we have compared our method with well-known base-level algorithms on some data sets. the experimental results show that the proposed approach outperforms the well-known base classifiers on the multi-label classification problem.",4 "on the relative succinctness of sentential decision diagrams. sentential decision diagrams (sdds), introduced by darwiche in 2011, are a promising representation type used in knowledge compilation. the relative succinctness of representation types is an important subject in this area.
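the centralized greedy baseline mentioned in the submodular maximization abstract is easy to sketch on max-coverage, a classic submodular objective (illustrative data; not the paper's algorithm or code):

```python
# The centralized greedy baseline for cardinality-constrained submodular
# maximization: repeatedly add the element with the largest marginal gain.
def greedy_max_coverage(sets, k):
    covered, chosen = set(), []
    for _ in range(k):
        best = max(sets, key=lambda s: len(sets[s] - covered))
        if not sets[best] - covered:
            break                      # no remaining marginal gain
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = {
    'a': {1, 2, 3, 4},
    'b': {3, 4, 5},
    'c': {5, 6},
    'd': {1, 6},
}
chosen, covered = greedy_max_coverage(sets, 2)
print(chosen, sorted(covered))
```

greedy picks 'a' (gain 4) and then 'c' (gain 2), covering all six items; by the classical result for monotone submodular objectives, this achieves at least a 1 - 1/e fraction of the optimum.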
the aim of this paper is to identify which kinds of boolean functions can be represented by sdds of small size with respect to the number of variables the functions are defined on. for this reason, the sets of boolean functions representable in polynomial size by different representation types are investigated, and sdds are compared with representation types from the classical knowledge compilation map of darwiche and marquis. ordered binary decision diagrams (obdds), a popular data structure for boolean functions, are one of these representation types. sdds are more general than obdds by definition, and recently a boolean function was presented with polynomial sdd size but exponential obdd size. this result is strengthened in several ways. the main result is a quasipolynomial simulation of sdds by equivalent unambiguous nondeterministic obdds, a nondeterministic variant for which there exists exactly one accepting computation for each satisfying input. as a side effect, an open problem about the relative succinctness between sdds and free binary decision diagrams (fbdds), which are more general than obdds, is answered.",4 approximate reflection symmetry in a point set: theory, algorithm, and an application. we propose an algorithm to detect the approximate reflection symmetry present in a set of volumetrically distributed points belonging to $\mathbb{r}^d$ containing a distorted reflection symmetry pattern. we pose the problem of detecting approximate reflection symmetry as the problem of establishing correspondences between the points which are reflections of each other and determining the reflection symmetry transformation. we formulate an optimization framework in which the problem of establishing correspondences amounts to solving a linear assignment problem, and the problem of determining the reflection symmetry transformation amounts to an optimization problem on a smooth riemannian product manifold. the proposed approach estimates the symmetry from the distribution of the points and is descriptor independent. we evaluate the robustness of our approach by varying the amount of distortion in a perfect reflection symmetry pattern, where we perturb each point by a different amount of perturbation. we demonstrate the effectiveness of the method by applying it to the problem of 2-d reflection symmetry detection, along with relevant comparisons.,4 "dependency parsing with dilated iterated graph cnns.
dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale. recent advances in gpu hardware have enabled neural networks to achieve significant gains over the previous best models, but these models still fail to leverage gpus' capability for massive parallelism due to their requirement of sequential processing of the sentence. in response, we propose dilated iterated graph convolutional neural networks (dig-cnns) for graph-based dependency parsing, a graph convolutional architecture that allows for efficient end-to-end gpu parsing. in experiments on the english penn treebank benchmark, we show that dig-cnns perform on par with some of the best neural network parsers.",4 "densenet: implementing efficient convnet descriptor pyramids. convolutional neural networks (cnns) can provide accurate object classification. they can be extended to perform object detection by iterating over dense or selected proposed object regions. however, the runtime of such detectors scales with the total number and/or area of regions to examine per image, and training such detectors may be prohibitively slow. however, for some cnn classifier topologies, it is possible to share significant work among overlapping regions to be classified. this paper presents densenet, an open source system that computes dense, multiscale features from the convolutional layers of a cnn based object classifier. future work will involve training efficient object detectors with densenet feature descriptors.",4 "a glimpse far into the future: understanding long-term crowd worker quality. microtask crowdsourcing is increasingly critical to the creation of extremely large datasets. as a result, crowd workers spend weeks or months repeating the exact same tasks, making it necessary to understand their behavior over these long periods of time. we utilize three large, longitudinal datasets of nine million annotations collected from amazon mechanical turk to examine claims that workers fatigue or satisfice over these long periods, producing lower quality work. we find that, contrary to these claims, workers are extremely stable in their quality over the entire period. to understand whether workers set their quality based on the task's requirements for acceptance, we perform an experiment in which we vary the required quality for a large crowdsourcing task.
workers did not adjust their quality based on the acceptance threshold: workers above the threshold continued working at their usual quality level, and workers below the threshold self-selected themselves out of the task. capitalizing on this consistency, we demonstrate that it is possible to predict workers' long-term quality using just a glimpse of their quality on the first five tasks.",4 "clicks there!: anonymizing the photographer in a camera saturated society. in recent years, social media has played an increasingly important role in reporting world events. the publication of crowd-sourced photographs and videos in near real-time is one of the reasons behind its high impact. however, the use of a camera can draw the photographer into a situation of conflict. examples include the use of cameras by regulators collecting evidence of mafia operations; citizens collecting evidence of corruption at a public service outlet; and political dissidents protesting at public rallies. in all these cases, the published images contain fairly unambiguous clues about the location of the photographer (scene viewpoint information). in the presence of adversary operated cameras, it can be easy to identify the photographer by also combining leaked information from the photographs themselves. we call this the camera location detection attack. we propose and review defense techniques against such attacks. defenses such as image obfuscation techniques do not protect camera-location information; current anonymous publication technologies do not help either. however, the use of view synthesis algorithms could be a promising step in the direction of providing probabilistic privacy guarantees.",4 "diffusercam: lensless single-exposure 3d imaging. we demonstrate a compact and easy-to-build computational camera for single-shot 3d imaging. our lensless system consists solely of a diffuser placed in front of a standard image sensor. every point within the volumetric field-of-view projects a unique pseudorandom pattern of caustics on the sensor. by using a physical approximation and a simple calibration scheme, we solve the large-scale inverse problem in a computationally efficient way. the caustic patterns enable compressed sensing, which exploits sparsity in the sample to solve for more 3d voxels than pixels on the 2d sensor.
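sparse inverse problems of the kind described in the diffusercam abstract are commonly attacked with proximal gradient methods. the following is a minimal illustrative sketch of iterative shrinkage-thresholding (ista) on a generic underdetermined system; the measurement matrix, sparsity weight `lam`, and iteration count are all assumptions for illustration, not the paper's actual solver or calibration.

```python
import numpy as np

def ista(A, y, lam=0.05, n_iter=1000):
    """iterative shrinkage-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                # lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L            # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

# recover a 3-sparse vector from 40 random measurements of a length-100 signal
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[3, 50, 77]] = [1.5, -2.0, 1.0]
y = A @ x_true
x_hat = ista(A, y)
```

the same structure (fewer measurements than unknowns, recoverable because the signal is sparse) is what lets a 2d sensor constrain many more 3d voxels than it has pixels.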
the 3d voxel grid is chosen to match the experimentally measured two-point optical resolution across the field-of-view, resulting in 100 million voxels being reconstructed from a single 1.3 megapixel image. however, the effective resolution varies significantly with scene content. because this effect is common to a wide range of computational cameras, we provide a new theory for analyzing the resolution of such systems.",4 "integration of lidar and hyperspectral data for land-cover classification: a case study. in this paper, an approach is proposed to fuse lidar and hyperspectral data, which considers both spectral and spatial information in a single framework. here, an extended self-dual attribute profile (esdap) is investigated to extract spatial information from a hyperspectral data set. to extract spectral information, a few well-known classifiers have been used, such as support vector machines (svms), random forests (rfs), and artificial neural networks (anns). the proposed method can accurately classify a relatively volumetric data set in a reasonable cpu processing time in a real ill-posed situation where there is no balance between the number of training samples and the number of features. the classification part of the proposed approach is fully-automatic.",4 "normalization based k means clustering algorithm. k-means is an effective clustering technique used to separate similar data into groups based on the initial centroids of clusters. in this paper, a normalization based k-means clustering algorithm (n-k means) is proposed. the proposed n-k means clustering algorithm applies normalization prior to clustering the available data, and the proposed approach also calculates the initial centroids based on weights. experimental results prove the betterment of the proposed n-k means clustering algorithm over the existing k-means clustering algorithm in terms of complexity and overall performance.",4 "certifying the existence of epipolar matrices. given a set of point correspondences in two images, the existence of a fundamental matrix is a necessary condition for the points to be the images of a 3-dimensional scene imaged by two pinhole cameras. if the camera calibration is known, one requires the existence of an essential matrix. we present an efficient algorithm, using exact linear algebra, for testing the existence of a fundamental matrix. the input is any number of point correspondences.
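as a concrete point of reference for the fundamental-matrix machinery in the epipolar abstract above, the classical eight-point construction (a standard textbook method, not the exact-linear-algebra certificate the abstract proposes) builds a design matrix from the correspondences, reads a candidate matrix off its null space, and enforces the rank-2 constraint:

```python
import numpy as np

def fundamental_from_correspondences(pts1, pts2):
    """least-squares fundamental matrix via the classic eight-point construction.
    pts1, pts2: (n, 2) arrays of matched image points, n >= 8."""
    x1, y1 = pts1[:, 0], pts1[:, 1]
    x2, y2 = pts2[:, 0], pts2[:, 1]
    # each correspondence x2^T F x1 = 0 contributes one row of the design matrix
    A = np.stack([x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1,
                  np.ones_like(x1)], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # null vector -> candidate F (up to scale)
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                        # enforce the rank-2 constraint on F
    return U @ np.diag(S) @ Vt
```

with noise-free correspondences the design matrix has a one-dimensional null space, and the recovered matrix satisfies the epipolar constraint for every pair; the existence question studied in the abstract concerns exactly when such a (valid) matrix exists.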
for essential matrices, we characterize solvability by the demazure polynomials. in both scenarios, we determine which linear subspaces intersect a fixed set defined by non-linear polynomials. the conditions we derive are polynomials stated purely in terms of image coordinates. they represent a new class of two-view invariants, free of fundamental (resp.~essential)~matrices.",4 "an automated auto-encoder correlation-based health-monitoring and prognostic method for machine bearings. this paper studies an intelligent and ultimate technique for health-monitoring and prognostics of common rotary machine components, particularly bearings. during a run-to-failure experiment, rich unsupervised features are extracted from vibration sensory data by a trained sparse auto-encoder. then, the correlation of the extracted attributes of the initial samples (presumably healthy at the beginning of the test) with the succeeding samples is calculated and passed through a moving-average filter. the normalized output, named the auto-encoder correlation-based (aec) rate, stands as an informative attribute of the system, depicting its health status and precisely identifying the degradation starting point. we show that the aec technique generalizes well over several run-to-failure tests. aec collects rich unsupervised features from the vibration data in a fully autonomous fashion. we demonstrate the superiority of the aec over many state-of-the-art approaches for health monitoring and prognostics of machine bearings.",4 "retinal vessel segmentation in fundoscopic images with generative adversarial networks. retinal vessel segmentation is an indispensable step for the automatic detection of retinal diseases in fundoscopic images. though many approaches have been proposed, existing methods tend to miss fine vessels or allow false positives at terminal branches. let alone under-segmentation, over-segmentation is also problematic for quantitative studies that need to measure the precise width of vessels. in this paper, we present a method that generates a precise map of retinal vessels using generative adversarial training. our method achieves a dice coefficient of 0.829 on the drive dataset and 0.834 on the stare dataset, the state-of-the-art performance on both datasets.",4 "inference networks and the evaluation of evidence: alternative analyses.
inference networks have a variety of important uses and are constructed by persons from quite different standpoints. discussed in this paper are three different but complementary methods for generating and analyzing probabilistic inference networks. the first method, though eighty years old, is useful as a knowledge representation in the task of constructing probabilistic arguments. it is also useful as a heuristic device for generating new forms of evidence. the other two methods are formally equivalent ways of combining probabilities in the analysis of inference networks. the use of all three methods is illustrated in the analysis of a mass of evidence in a celebrated american law case.",4 "neural network based nonlinear weighted finite automata. weighted finite automata (wfa) can expressively model functions defined over strings but are inherently linear models. given the recent successes of nonlinear models in machine learning, it is natural to wonder whether extending wfa to the nonlinear setting would be beneficial. in this paper, we propose a novel model of neural network based nonlinear wfa (nl-wfa) along with a learning algorithm. our learning algorithm is inspired by the spectral learning algorithm for wfa and relies on a nonlinear decomposition of the so-called hankel matrix, by means of an auto-encoder network. the expressive power of nl-wfa and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that nl-wfa can lead to smaller model sizes and infer complex grammatical structures from data.",4 "a face synthesis (fasy) system for determining the characteristics of a face image. this paper aims at determining the characteristics of a face image by extracting its components. the fasy (face synthesis) system is a face database retrieval and new face generation system under development. one of its main features is the generation of the requested face when it is not found in the existing database, which also allows the database to grow continuously. to generate a new face image, we need to store the face components in the database. we designed a new technique to extract the face components by a sophisticated method. after extraction of the facial feature points, we analyzed the components to determine their characteristics. after extraction and analysis, we stored the components along with their characteristics in the face database for later use in face construction.",4 "real-time human pose estimation from video with convolutional neural networks.
in this paper, we present a method for real-time multi-person human pose estimation from video by utilizing convolutional neural networks. our method is aimed at use case specific applications, where good accuracy is essential and the variation of the background and poses is limited. this enables us to use a generic network architecture, which is both accurate and fast. we divide the problem into two phases: (1) pre-training and (2) finetuning. in pre-training, the network is learned with highly diverse input data from publicly available datasets, while in finetuning we train with application specific data, which we record with kinect. our method differs from most of the state-of-the-art methods in that we consider the whole system, including the person detector, the pose estimator, and an automatic way to record application specific training material for finetuning. our method is considerably faster than many of the state-of-the-art methods. it can be thought of as a replacement for kinect, and it can be used for higher level tasks, such as gesture control, games, person tracking, action recognition, and action tracking. we achieved an accuracy of 96.8\% (pck@0.2) with application specific data.",4 "learned multi-patch similarity. estimating a depth map from multiple views of a scene is a fundamental task in computer vision. as soon as more than two viewpoints are available, one faces the very basic question of how to measure similarity across >2 image patches. surprisingly, no direct solution exists; instead, it is common to fall back to more or less robust averaging of two-view similarities. encouraged by the success of machine learning, and in particular convolutional neural networks, we propose to learn a matching function which directly maps multiple image patches to a scalar similarity score. experiments on several multi-view datasets demonstrate that this approach has advantages over methods based on pairwise patch similarity.",4 "bet on independence. we study the problem of nonparametric dependence detection. many existing methods suffer severe power loss due to non-uniform consistency, which we illustrate with a paradox. to avoid such power loss, we approach the nonparametric test of independence through the new framework of binary expansion statistics (bestat) and binary expansion testing (bet), which examine dependence through a novel binary expansion filtration approximation of the copula.
through a hadamard-walsh transform, we find that the cross interactions of binary variables in the filtration are complete sufficient statistics for dependence. these interactions are also uncorrelated under the null. by utilizing these interactions, the bet avoids the problem of non-uniform consistency and improves upon a wide class of commonly used methods (a) by achieving the minimax rate in sample size requirement for specified power and (b) by providing clear interpretations of global and local relationships upon rejection of independence. the binary expansion approach also connects the test statistics with the current computing system to facilitate efficient bitwise implementation. we illustrate the bet with a study of the distribution of stars in the night sky and with an exploratory data analysis of the tcga breast cancer data.",12 "assessment of algorithms for mitosis detection in breast cancer histopathology images. the proliferative activity of breast tumors, routinely estimated by counting mitotic figures in hematoxylin and eosin stained histology sections, is considered one of the most important prognostic markers. however, mitosis counting is laborious, subjective, and may suffer from low inter-observer agreement. with the wider acceptance of whole slide images in pathology labs, automatic image analysis has been proposed as a potential solution for these issues. in this paper, the results of the assessment of mitosis detection algorithms 2013 (amida13) challenge are described. the challenge was based on a data set consisting of 12 training and 11 testing subjects, with over one thousand mitotic figures annotated by multiple observers. short descriptions and results from the evaluation of eleven methods are presented. the top performing method has an error rate that is comparable to the inter-observer agreement among pathologists.",4 "hierarchized block wise image approximation by greedy pursuit strategies. an approach for the effective implementation of greedy selection methodologies, to approximate an image partitioned into blocks, is proposed. the method is specially designed for approximating partitions on a transformed image. it evolves by selecting, at each iteration step, i) the elements for approximating each of the blocks partitioning the image and ii) the hierarchized sequence of blocks to be approximated in order to reach a required global condition on sparsity.",4 "abstract syntax networks for code generation and semantic parsing.
tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. we introduce abstract syntax networks, a modeling framework for these problems. the outputs are represented as abstract syntax trees (asts) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. on the benchmark hearthstone dataset for code generation, our model obtains 79.2 bleu and 22.7% exact match accuracy, compared to previous state-of-the-art values of 67.1 and 6.1%. furthermore, we perform competitively on the atis, jobs, and geo semantic parsing datasets with no task-specific engineering.",4 "robust image registration via empirical mode decomposition. spatially varying intensity noise is a common source of distortion in images. bias field noise is one example of such distortion that is often present in magnetic resonance (mr) images. in this paper, we first show that empirical mode decomposition (emd) can considerably reduce the bias field noise in mr images. then, we propose two hierarchical multi-resolution emd-based algorithms for robust registration of images in the presence of spatially varying noise. one algorithm (lr-emd) is based on registering the emd feature-maps of both the floating and reference images at various resolution levels. in the second algorithm (afr-emd), we first extract an average feature-map based on emd from both the floating and reference images. then, we use a simple hierarchical multi-resolution algorithm based on downsampling to register the average feature-maps. both algorithms achieve a lower error rate and a higher convergence percentage compared to intensity-based hierarchical registration. specifically, using mutual information as the similarity measure, afr-emd achieves a 42% lower error rate in intensity and a 52% lower error rate in transformation compared to intensity-based hierarchical registration. for lr-emd, the error rate is 32% lower in intensity and 41% lower in transformation.",4 "asynchronous distributed variational gaussian processes for regression. gaussian processes (gps) are powerful non-parametric function estimators. however, their applications are largely limited by the expensive computational cost of the inference procedures.
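the cost referred to in the last sentence comes from factorising the n x n kernel matrix. a textbook exact gp regressor (illustrative only, not advgp; the rbf kernel, lengthscale, and noise level are assumed here) makes the o(n^3) bottleneck explicit:

```python
import numpy as np

def gp_predict(X, y, Xs, lengthscale=1.0, noise=0.1):
    """exact gp regression with an rbf kernel; the o(n^3) cholesky
    factorisation is the bottleneck that variational and distributed
    approximations are designed to avoid."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)
    K = rbf(X, X) + noise**2 * np.eye(len(X))   # n x n kernel matrix
    L = np.linalg.cholesky(K)                    # o(n^3) in the training size
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf(Xs, X) @ alpha                    # predictive mean at Xs

X = np.linspace(0, 5, 50)[:, None]
y = np.sin(X[:, 0])
mu = gp_predict(X, y, X, lengthscale=1.0, noise=0.05)
```

at n in the billions, as in the abstract's target setting, this direct factorisation is hopeless, which motivates the asynchronous variational scheme the abstract goes on to describe.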
existing stochastic or distributed synchronous variational inferences, although they have alleviated the issue by scaling gps up to millions of samples, are still far from satisfactory for real-world large applications, where the data sizes are often orders of magnitudes larger, say, billions. to solve this problem, we propose advgp, the first asynchronous distributed variational gaussian process inference for regression, on the recent large-scale machine learning platform, parameterserver. advgp uses a novel, flexible variational framework based on a weight space augmentation, and implements highly efficient, asynchronous proximal gradient optimization. while maintaining comparable or better predictive performance, advgp greatly improves upon the efficiency of the existing variational methods. with advgp, we effortlessly scale up gp regression to a real-world application with billions of samples, and demonstrate excellent, superior prediction accuracy to the popular linear models.",19 "denoising adversarial autoencoders: classifying skin lesions using limited labelled training data. we propose a novel deep learning model for classifying medical images in the setting where a large amount of unlabelled medical data is available, but labelled data is in limited supply. we consider the specific case of classifying skin lesions as either malignant or benign. in this setting, the proposed approach -- a semi-supervised, denoising adversarial autoencoder -- is able to utilise vast amounts of unlabelled data to learn a representation for skin lesions, and small amounts of labelled data to assign class labels based on the learned representation. we analyse the contributions of both the adversarial and denoising components of the model and find that their combination yields superior classification performance in the setting of limited labelled training data.",4 "tagging multimedia stimuli with ontologies. successful management of emotional stimuli is a pivotal issue concerning affective computing (ac) and its related research. as a subfield of artificial intelligence, ac is concerned not only with the design of computer systems and the accompanying hardware that can recognize, interpret, and process human emotions, but also with the development of systems that can trigger a human emotional response in an ordered and controlled manner.
this requires the maximum attainable precision and efficiency in the extraction of data from emotionally annotated databases, but such databases use keywords or tags for the description of semantic content, which provide neither the necessary flexibility nor the leverage needed to efficiently extract the pertinent emotional content. therefore, to this extent, we propose the introduction of ontologies as a new paradigm for the description of emotionally annotated data. the ability to select and sequence data based on their semantic attributes is vital for any study involving metadata, semantics, and ontological sorting, like the semantic web or the social semantic desktop, and the approach described in this paper facilitates reuse in those areas as well.",4 "the nonparanormal: semiparametric estimation of high dimensional undirected graphs. recent methods for estimating sparse undirected graphs for real-valued data in high dimensional problems rely heavily on the assumption of normality. we show how to use a semiparametric gaussian copula--or ""nonparanormal""--for high dimensional inference. just as additive models extend linear models by replacing linear functions with a set of one-dimensional smooth functions, the nonparanormal extends the normal by transforming the variables by smooth functions. we derive a method for estimating the nonparanormal, study the method's theoretical properties, and show that it works well in many examples.",19 "evaluating semantic models with word-sentence relatedness. semantic textual similarity (sts) systems are designed to encode and evaluate the semantic similarity between words, phrases, sentences, and documents. one method for assessing the quality or authenticity of the semantic information encoded in these systems is by comparison with human judgments. a data set for evaluating semantic models was developed, consisting of 775 english word-sentence pairs annotated for semantic relatedness by human raters engaged in a maximum difference scaling (mds) task, as well as a faster alternative task. as a sample application of this relatedness data, behavior-based relatedness was compared to the relatedness computed via four off-the-shelf sts models: n-gram, latent semantic analysis (lsa), word2vec, and umbc ebiquity. while the sts models captured much of the variance in the human judgments collected, they were not sensitive to the implicatures and entailments that were processed and considered by the participants.
all text stimuli and judgment data have been made freely available.",4 "what is the best practice for cnns applied to visual instance retrieval?. previous work has shown that the feature maps of deep convolutional neural networks (cnns) can be interpreted as a feature representation of a particular image region. features aggregated from these feature maps have been exploited for image retrieval tasks and have achieved state-of-the-art performances in recent years. the key to the success of such methods is the feature representation. however, the different factors that impact the effectiveness of features are still not explored thoroughly, and there is much less discussion about the best combination of them. the main contribution of this paper is a thorough evaluation of the various factors that affect the discriminative ability of the features extracted from cnns. based on the evaluation results, we also identify the best choices for different factors and propose a new multi-scale image feature representation method to encode the image effectively. finally, we show that the proposed method generalises well and outperforms the state-of-the-art methods on four typical datasets used for visual instance retrieval.",4 "computational cost reduction in learned transform classifications. we present a theoretical analysis and empirical evaluations of a novel set of techniques for computational cost reduction of classifiers that are based on learned transform and soft-threshold. by modifying the optimization procedures for dictionary and classifier training, as well as the resulting dictionary entries, our techniques allow one to reduce the bit precision and to replace each floating-point multiplication by a single integer bit shift. we also show how the optimization algorithms in some dictionary training methods can be modified to penalize higher-energy dictionaries. we applied our techniques with a classifier learning algorithm based on soft-thresholding, testing on the datasets used in its original paper. our results indicate that it is feasible to use solely sums and bit shifts of integers to classify at test time, with a limited reduction in classification accuracy. these low power operations are a valuable trade-off in fpga implementations, as they increase the classification throughput while decreasing both energy consumption and manufacturing cost.",4 "multi-scale deep learning architectures for person re-identification.
person re-identification (re-id) aims to match people across non-overlapping camera views in a public space. this is a challenging problem because many people captured in surveillance videos wear similar clothes. consequently, the differences in their appearance are often subtle and only detectable at the right location and scales. existing re-id models, particularly the recently proposed deep learning based ones, match people at a single scale. in contrast, in this paper, a novel multi-scale deep learning model is proposed. our model is able to learn deep discriminative feature representations at different scales and automatically determine the most suitable scales for matching. the importance of different spatial locations for extracting discriminative features is also learned explicitly. experiments are carried out to demonstrate that the proposed model outperforms the state-of-the-art on a number of benchmarks.",4 "spatially encoding temporal correlations to classify temporal data using convolutional neural networks. we propose an off-line approach to explicitly encode temporal patterns spatially as different types of images, namely, gramian angular fields and markov transition fields. this enables the use of techniques from computer vision for feature learning and classification. we used tiled convolutional neural networks to learn high-level features from individual gaf, mtf, and gaf-mtf images on 12 benchmark time series datasets and two real spatial-temporal trajectory datasets. the classification results of our approach are competitive with state-of-the-art approaches on both types of data. an analysis of the features and weights learned by the cnns explains why the approach works.",4 "on the benefit of combining neural, statistical and external features for fake news identification. identifying the veracity of a news article is an interesting problem, and automating this process can be a challenging task. detection of a news article as fake is still an open question, as it is contingent on many factors which the current state-of-the-art models fail to incorporate. in this paper, we explore a subtask of fake news identification, namely stance detection. given a news article, the task is to determine the relevance of the body to the claim. we present a novel idea that combines neural, statistical and external features to provide an efficient solution to this problem.
we compute the neural embedding from a deep recurrent model, the statistical features from a weighted n-gram bag-of-words model, and handcrafted external features with the help of feature engineering heuristics. finally, using a deep neural layer, all the features are combined, thereby classifying the headline-body news pair as agree, disagree, discuss, or unrelated. we compare our proposed technique with the current state-of-the-art models on the fake news challenge dataset. through extensive experiments, we find that the proposed model outperforms all the state-of-the-art techniques, including the submissions to the fake news challenge.",4 "corpus based enrichment of germanet verb frames. lexical semantic resources, like wordnet, are often used in real applications of natural language document processing. for example, we integrated germanet in our document suite xdoc for the processing of german forensic autopsy protocols. in addition to the hypernymy and synonymy relations, we want to adapt germanet's verb frames for our analysis. in this paper, we outline our approach for a domain related enrichment of germanet verb frames with corpus based syntactic co-occurrence data and analyses of real documents.",4 "scalable out-of-sample extension of graph embeddings using deep neural networks. several popular graph embedding techniques for representation learning and dimensionality reduction rely on performing computationally expensive eigendecompositions to derive a nonlinear transformation of the input data space. the resulting eigenvectors encode the embedding coordinates for the training samples only, so the embedding of novel data samples requires further costly computation. in this paper, we present a method for the out-of-sample extension of graph embeddings using deep neural networks (dnn) to parametrically approximate these nonlinear maps. compared with traditional nonparametric out-of-sample extension methods, we demonstrate that the dnns can generalize with equal or better fidelity and require orders of magnitude less computation at test time. moreover, we find that unsupervised pretraining of the dnns improves optimization for larger network sizes, thus removing sensitivity to model selection.",19 "multimodal named entity recognition for short social media posts.
we introduce a new task called multimodal named entity recognition (mner) for noisy user-generated data such as tweets or snapchat captions, which comprise short text with accompanying images. these social media posts often come with inconsistent or incomplete syntax and lexical notations and with very limited surrounding textual context, bringing significant challenges for ner. to this end, we create a new dataset for mner called snapcaptions (snapchat image-caption pairs submitted to public and crowd-sourced stories, fully annotated for named entities). we then build upon the state-of-the-art bi-lstm word/character based ner models with 1) a deep image network which incorporates relevant visual context to augment textual information, and 2) a generic modality-attention module which learns to attenuate irrelevant modalities while amplifying the most informative ones to extract contexts from, adaptive to each sample and token. the proposed mner model with modality attention significantly outperforms the state-of-the-art text-only ner models by successfully leveraging the provided visual contexts, opening up potential applications of mner on myriads of social media platforms.",4 "stochastic gradient estimate variance in contrastive divergence and persistent contrastive divergence. contrastive divergence (cd) and persistent contrastive divergence (pcd) are popular methods for training the weights of restricted boltzmann machines. however, both methods use an approximate method for sampling from the model distribution. as a side effect, these approximations yield significantly different biases and variances for stochastic gradient estimates of individual data points. it is well known that cd yields a biased gradient estimate. in this paper, however, we show empirically that cd has a lower stochastic gradient estimate variance than exact sampling, while the mean of subsequent pcd estimates has a higher variance than exact sampling. the results give one explanation for the finding that cd can be used with smaller minibatches or higher learning rates than pcd.",4 "a comparison of echo state network output layer classification methods on noisy data. echo state networks are a recently developed type of recurrent neural network where the internal layer is fixed with random weights, and only the output layer is trained on specific data.
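the fixed-reservoir-plus-trained-readout recipe just described can be sketched in a few lines; the reservoir size, spectral radius, ridge penalty, and the one-step sine prediction task below are all illustrative assumptions, not details from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# fixed random reservoir; only the linear readout is trained
n_res, n_in = 100, 1
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

def run_reservoir(u):
    """drive the reservoir with input sequence u and collect its states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ np.atleast_1d(u_t) + W @ x)
        states.append(x.copy())
    return np.array(states)

# train the readout for one-step-ahead prediction of a sine wave
u = np.sin(np.arange(500) * 0.1)
S = run_reservoir(u[:-1])
target = u[1:]
ridge = 1e-6                                       # regularized least squares
W_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ target)
pred = S @ W_out
```

the ridge-regularized least squares step is the readout-training method whose behavior on noisy data the study below compares against sparse and low-rank alternatives.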
echo state networks are increasingly being used to process spatiotemporal data in real-world settings, including speech recognition, event detection, and robot control. a strength of echo state networks is the simple method used to train the output layer - typically a collection of linear readout weights found using a least squares approach. although straightforward to train and of low computational cost to use, this method may not yield acceptable accuracy or performance on noisy data. this study compares the performance of three echo state network output layer methods for performing classification on noisy data: using trained linear weights, using sparse trained linear weights, and using trained low-rank approximations of the reservoir states. the methods are investigated experimentally on both synthetic and natural datasets. the experiments suggest that using regularized least squares to train linear output weights is superior on data with low noise, but using low-rank approximations may significantly improve accuracy on datasets contaminated with higher noise levels.",4 "virtual sensor modelling using neural networks with coefficient-based adaptive weights and biases search algorithm for diesel engines. with the explosion of the field of big data and the introduction of stringent emission norms every three to five years, automotive companies must not only continue to enhance the fuel economy ratings of their products, but also provide valued services to their customers by delivering engine performance and health reports at regular intervals. a reasonable solution to both issues is installing a variety of sensors on the engine. the sensor data can be used to develop fuel economy features and to directly indicate engine performance. however, mounting a plethora of sensors is impractical in this cost-sensitive industry. thus, virtual sensors can replace physical sensors, reducing cost while still capturing the essential engine data.",4 "gradient descent learns one-hidden-layer cnn: don't be afraid of spurious local minima.
we consider the problem of learning a one-hidden-layer neural network with a non-overlapping convolutional layer and relu activation function, i.e., $f(\mathbf{z}; \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^\top\mathbf{z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned. we prove that with gaussian input $\mathbf{z}$, there is a spurious local minimum that is not the global minimum. surprisingly, even in the presence of the local minimum, starting from randomly initialized weights, gradient descent with weight normalization can still be proven to recover the true parameters with constant probability (which can be boosted to arbitrarily high accuracy with multiple restarts). we also show that with constant probability, the same procedure could also converge to the spurious local minimum, showing that the local minimum plays a non-trivial role in the dynamics of gradient descent. furthermore, a quantitative analysis shows that the gradient descent dynamics have two phases: it starts off slow, but converges much faster after several iterations.",4 "msr-net: low-light image enhancement using a deep convolutional network. images captured in low-light conditions usually suffer from very low contrast, which increases the difficulty of subsequent computer vision tasks to a great extent. in this paper, a low-light image enhancement model based on a convolutional neural network and retinex theory is proposed. firstly, we show that multi-scale retinex is equivalent to a feedforward convolutional neural network with different gaussian convolution kernels. motivated by this fact, we consider a convolutional neural network (msr-net) that directly learns an end-to-end mapping between dark and bright images. differing fundamentally from existing approaches, low-light image enhancement in this paper is regarded as a machine learning problem. in this model, most of the parameters are optimized by back-propagation, while the parameters of traditional models depend on an artificial setting. experiments on a number of challenging images reveal the advantages of our method in comparison with other state-of-the-art methods from both a qualitative and a quantitative perspective.",4 "scalable latent tree model and its application to health analytics. we present an integrated approach for structure and parameter estimation in latent tree graphical models, where some nodes are hidden.
Our overall approach follows a ""divide-and-conquer"" strategy that learns models over small groups of variables and iteratively merges them into a global solution. The structure learning involves combinatorial operations such as minimum spanning tree construction and local recursive grouping; the parameter learning is based on the method of moments and tensor decompositions. Our method is guaranteed to correctly recover the unknown tree structure and the model parameters with low sample complexity for the class of linear multivariate latent tree models, which includes discrete and Gaussian distributions as well as Gaussian mixtures. Our bulk asynchronous parallel algorithm is implemented in parallel using the OpenMP framework and scales logarithmically with the number of variables and linearly with the dimensionality of each variable. Our experiments confirm a high degree of efficiency and accuracy on large datasets of electronic health records. The proposed algorithm also generates intuitive and clinically meaningful disease hierarchies.",4 "Travel time estimation using floating car data. This report explores the use of machine learning techniques to accurately predict travel times in city streets and highways using floating car data (location information of user vehicles on a road network). The aim of this report is twofold: first, to present a general architecture for solving this problem, and then to present and evaluate techniques on real floating car data gathered over a month on a 5 km highway in New Delhi.",4 "Quantifying mesoscale neuroanatomy using x-ray microtomography. Methods for resolving the 3D microstructure of the brain typically start by thinly slicing and staining the brain, and then imaging each individual section with visible light photons or electrons. In contrast, x-rays can be used to image thick samples, providing a rapid approach for producing large 3D brain maps without sectioning. Here we demonstrate the use of synchrotron x-ray microtomography ($\mu$CT) for producing mesoscale $(1~\mu m^3)$ resolution brain maps from millimeter-scale volumes of mouse brain. We introduce a pipeline for $\mu$CT-based brain mapping that combines methods for sample preparation, imaging, automated segmentation of image volumes into cells and blood vessels, and statistical analysis of the resulting brain structures. 
Our results demonstrate that x-ray tomography promises rapid quantification of large brain volumes, complementing other brain mapping and connectomics efforts.",16 "OBDA constraints for effective query answering (extended version). In ontology based data access (OBDA), users pose SPARQL queries over an ontology that lies on top of relational datasources. These queries are translated on-the-fly into SQL queries by OBDA systems. Standard SPARQL-to-SQL translation techniques in OBDA often produce SQL queries containing redundant joins and unions, even after a number of semantic and structural optimizations. These redundancies are detrimental to the performance of query answering, especially in complex industrial OBDA scenarios with large enterprise databases. To address this issue, we introduce two novel notions of OBDA constraints and show how to exploit them for efficient query answering. We conduct an extensive set of experiments on large datasets using real world data and queries, showing that our techniques strongly improve the performance of query answering, up to orders of magnitude.",4 "High-order attention models for visual question answering. The quest for algorithms that enable cognitive abilities is an important part of machine learning. A common trait in many recently investigated cognitive-like tasks is that they take into account different data modalities, such as visual and textual input. In this paper we propose a novel and generally applicable form of attention mechanism that learns high-order correlations between various data modalities. We show that high-order correlations effectively direct the appropriate attention to the relevant elements in the different data modalities that are required to solve the joint task. We demonstrate the effectiveness of our high-order attention mechanism on the task of visual question answering (VQA), where we achieve state-of-the-art performance on the standard VQA dataset.",4 "Attention-based information fusion using multi-encoder-decoder recurrent neural networks. With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNN) are considered particularly well suited for modeling sensory and streaming data. When predicting future behavior, incorporating information from neighboring sensor stations is often beneficial. 
We propose a new RNN based architecture for context specific information fusion across multiple spatially distributed sensor stations. Hereby, latent representations of multiple local models, each modeling one sensor station, are joined and weighted according to their importance for the prediction. The particular importance is assessed depending on the current context using a separate attention function. We demonstrate the effectiveness of our model on three different real-world sensor network datasets.",4 "Moment based estimation of stochastic Kronecker graph parameters. Stochastic Kronecker graphs supply a parsimonious model for large sparse real world graphs. They can specify the distribution of a large random graph using only three or four parameters. Those parameters have, however, proved difficult to choose in specific applications. This article looks at method of moments estimators that are computationally much simpler than maximum likelihood. The estimators are fast, and in our examples they typically yield Kronecker parameters with expected feature counts closer to a given graph than we get from KronFit. The improvement is especially prominent for the number of triangles in the graph.",19 "Accurate facial parts localization and deep learning for 3D facial expression recognition. Meaningful facial parts can convey key cues for both facial action unit detection and expression prediction. A textured 3D face scan can provide both detailed 3D geometric shape and 2D texture appearance cues of the face, which are beneficial for facial expression recognition (FER). However, accurate facial parts extraction as well as their fusion are challenging tasks. In this paper, a novel system for 3D FER is designed based on accurate facial parts extraction and deep feature fusion of facial parts. In particular, each textured 3D face scan is firstly represented as a 2D texture map and a depth map with one-to-one dense correspondence. Then, the facial parts of both the texture map and the depth map are extracted using a novel 4-stage process consisting of facial landmark localization, facial rotation correction, facial resizing, facial parts bounding box extraction and post-processing procedures. Finally, deep fusion convolutional neural network (CNN) features of all facial parts are learned from texture maps and depth maps, respectively, and nonlinear SVMs are used for expression prediction. 
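The context-dependent weighting described in the multi-encoder-decoder fusion record above can be sketched as a softmax attention over per-station latent states. A minimal sketch; the dimensions, the bilinear scoring form, and the identity scoring matrix are assumptions, not the paper's exact attention function:

```python
import numpy as np

# Attention fusion over per-station latent states: score each station's
# representation against a context vector, then take the softmax-weighted sum.

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse(station_states, context, W_att):
    """station_states: (n_stations, d); context: (d,); W_att: (d, d)."""
    scores = station_states @ W_att @ context   # one relevance score per station
    weights = softmax(scores)                   # importance of each station
    return weights @ station_states, weights    # fused (d,) representation

rng = np.random.default_rng(1)
states = rng.normal(size=(4, 8))    # 4 stations, 8-dim latent states
context = rng.normal(size=8)        # current-context encoding
W_att = np.eye(8)                   # identity scoring matrix for the sketch
fused, w = fuse(states, context, W_att)
```

In the paper's setting `W_att` and the context encoder would be learned jointly with the local RNN models.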
Experiments are conducted on the BU-3DFE database, demonstrating the effectiveness of combining different facial parts with texture and depth cues, and reporting state-of-the-art results in comparison with existing methods under the same setting.",4 "Efficient construction of local parametric reduced order models using machine learning techniques. Reduced order models are computationally inexpensive approximations that capture the important dynamical characteristics of large, high-fidelity computer models of physical systems. This paper applies machine learning techniques to improve the design of parametric reduced order models. Specifically, machine learning is used to develop feasible regions in the parameter space where the admissible target accuracy is achieved with a predefined reduced order basis, to construct parametric maps, to choose the best of two already existing bases for a new parameter configuration from an accuracy point of view, and to pre-select the optimal dimension of the reduced basis that meets the desired accuracy. By combining the available information using bases concatenation and interpolation as well as high-fidelity solutions interpolation, we are able to build accurate reduced order models associated with new parameter settings. Promising numerical results with a viscous Burgers model illustrate the potential of machine learning approaches to help design better reduced order models.",4 "Generating news headlines with recurrent neural networks. We describe an application of an encoder-decoder recurrent neural network with LSTM units and attention to generating headlines from the text of news articles. We find that the model is quite effective at concisely paraphrasing news articles. Furthermore, we study how the neural network decides which input words to pay attention to, and specifically identify the function of the different neurons in a simplified attention mechanism. Interestingly, our simplified attention mechanism performs better than the more complex attention mechanism on a held out set of articles.",4 "Network statistics on early English syntax: structural criteria. This paper includes a reflection on the role of networks in the study of English language acquisition, as well as a collection of practical criteria to annotate free-speech corpora of children's utterances. 
At the theoretical level, the main claim of this paper is that syntactic networks should be interpreted as the outcome of the use of the syntactic machinery. Thus, the intrinsic features of the machinery are not accessible directly from (known) network properties. Rather, what one can see are global patterns of its use and, thus, a global view of the power and organization of the underlying grammar. Turning to practical issues, this paper examines how to build a net from the projection of syntactic relations. Recall that, as opposed to adult grammars, early-child language has no well-defined concept of structure. To overcome this difficulty, we develop a set of systematic criteria assuming a constituency hierarchy and a grammar based on lexico-thematic relations. At the end, we obtain a well defined corpus annotation that enables us to i) perform statistics on the size of structures and ii) build a network of syntactic relations on which to perform standard measures of complexity. We also provide a detailed example.",4 "The FF planning system: fast plan generation through heuristic search. We describe and evaluate the algorithmic techniques that are used in the FF planning system. Like the HSP system, FF relies on forward state space search, using a heuristic that estimates goal distances by ignoring delete lists. Unlike HSP's heuristic, our method does not assume that facts are independent. We introduce a novel search strategy that combines hill-climbing with systematic search, and we show how other powerful heuristic information can be extracted and used to prune the search space. FF was the most successful automatic planner at the recent AIPS-2000 planning competition. We review the results of the competition, give data for other benchmark domains, and investigate the reasons for the runtime performance of FF compared to HSP.",4 "Human communication systems evolve by cultural selection. Human communication systems, such as language, evolve culturally; their components undergo reproduction and variation. However, the role of selection in cultural evolutionary dynamics is less clear. Often neutral evolution (also known as 'drift') models are used to explain the evolution of human communication systems, and cultural evolution more generally. Under this account, cultural change is unbiased: for instance, vocabulary, baby names and pottery designs have been found to spread through random copying. While drift is the null hypothesis for models of cultural evolution, it does not always adequately explain empirical results. 
Alternative models include cultural selection, which assumes that variant adoption is biased. Theoretical models of human communication argue that in conversation interlocutors are biased to adopt each other's labels and other aspects of linguistic representation (including prosody and syntax). This basic alignment mechanism has been extended by computer simulation to account for the emergence of linguistic conventions. When agents are biased to match the linguistic behavior of their interlocutor, a single variant can propagate across an entire population of interacting computer agents. This behavior-matching account operates at the level of the individual. We call it the conformity-biased model. Under a different selection account, called content-biased selection, functional selection or replicator selection, variant adoption depends upon the intrinsic value of the particular variant (e.g., ease of learning or use). This second alternative account operates at the level of the cultural variant. Following Boyd and Richerson we call it the content-biased model. The present paper tests the drift model's and the two biased selection models' ability to explain the spread of communicative signal variants in an experimental micro-society.",4 "Optimizing non-decomposable performance measures: a tale of two classes. Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as the F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms, since their non-additive nature precludes a direct application of well-studied large scale optimization methods such as stochastic gradient descent. In this paper we reveal that for two large families of performance measures that can be expressed as functions of true positive/negative rates, it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of TPR and TNR, which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, using which we develop optimization techniques that enable truly point-based stochastic updates. 
For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods - often by an order of magnitude or more - and give similar or more accurate predictions on test data.",19 "TAP-DLND 1.0 : a corpus for document level novelty detection. Detecting the novelty of an entire document is an artificial intelligence (AI) frontier problem with widespread NLP applications, such as extractive document summarization, tracking the development of news events, predicting the impact of scholarly articles, etc. Important though the problem is, we are unaware of any benchmark document level data that correctly addresses the evaluation of automatic novelty detection techniques in a classification framework. To bridge this gap, we present here a resource for benchmarking techniques for document level novelty detection. We created the resource via event-specific crawling of news documents across several domains in a periodic manner. We release the annotated corpus with the necessary statistics and show its use with a developed system for the problem in concern.",4 "Ocular dominance patterns in mammalian visual cortex: a wire length minimization approach. We propose a theory of ocular dominance (OD) patterns in the mammalian primary visual cortex. This theory is based on the premise that the OD pattern is an adaptation to minimize the length of intra-cortical wiring. Thus we can understand the existing OD patterns by solving a wire length minimization problem. We divide all neurons into two classes: left-eye dominated and right-eye dominated. We find that the segregation of neurons into monocular regions reduces wire length when the number of connections with neurons of the same class differs from that with the other class. The shape of the regions depends on the relative fraction of neurons in the two classes. When the numbers are close, we find that the optimal OD pattern consists of interdigitating stripes. When one class is less numerous than the other, the optimal OD pattern consists of patches of the first class of neurons in a sea of the other class of neurons. We predict a transition from stripes to patches when the fraction of neurons dominated by the ipsilateral eye is about 40%. This prediction agrees with data from macaque and Cebus monkeys. The theory can be applied to other binary cortical systems.",3 "Reference-aware language models. 
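The key observation in the non-decomposable measures record above is that F-measure, G-mean and H-mean are all functions of the true positive rate (TPR) and true negative rate (TNR). A small worked sketch; the toy labels and the class-proportion handling are illustrative assumptions:

```python
# Express F1 (pseudo-linear in TPR, TNR: a ratio of two linear functions)
# and G-mean (concave in TPR, TNR) as functions of the two rates.

def rates(y_true, y_pred):
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum(not t and not p for t, p in zip(y_true, y_pred))
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / pos, tn / neg        # (TPR, TNR)

def f1(tpr, tnr, p):
    # With class proportion p: F1 = 2*p*TPR / (p + p*TPR + (1-p)*(1-TNR)),
    # a ratio of linear functions of (TPR, TNR), hence pseudo-linear.
    return 2 * p * tpr / (p + p * tpr + (1 - p) * (1 - tnr))

def g_mean(tpr, tnr):
    return (tpr * tnr) ** 0.5        # concave in (TPR, TNR)

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
tpr, tnr = rates(y_true, y_pred)     # (2/3, 4/5)
p = sum(y_true) / len(y_true)        # 3/8
```

Because both measures reduce to functions of the two rates, a linearization in (TPR, TNR) can be updated point by point, which is what makes the stochastic solvers possible.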
We propose a general class of language models that treat reference as an explicit stochastic latent variable. This architecture allows models to create mentions of entities and their attributes by accessing external databases (required by, e.g., dialogue generation and recipe generation) and internal state (required by, e.g., language models that are aware of coreference). This facilitates the incorporation of information that can be accessed in predictable locations in databases or discourse context, even when the targets of the reference may be rare words. Experiments on three tasks show that our model variants outperform models based on deterministic attention.",4 "Spatio-temporal facial expression recognition using convolutional neural networks and conditional random fields. Automated facial expression recognition (FER) has been a challenging task for decades. Many existing works use hand-crafted features such as LBP, HOG, LPQ, and histogram of optical flow (HOF) combined with classifiers such as support vector machines for expression recognition. These methods often require rigorous hyperparameter tuning to achieve good results. Recently, deep neural networks (DNN) have been shown to outperform traditional methods in visual object recognition. In this paper, we propose a two-part network consisting of a DNN-based architecture followed by a conditional random field (CRF) module for facial expression recognition in videos. The first part captures the spatial relation within facial images using convolutional layers followed by three inception-resnet modules and two fully-connected layers. To capture the temporal relation between the image frames, we use a linear chain CRF in the second part of our network. We evaluate our proposed network on three publicly available databases, viz. CK+, MMI, and FERA. Experiments are performed in subject-independent and cross-database manners. Our experimental results show that cascading the deep network architecture with the CRF module considerably increases the recognition of facial expressions in videos; in particular it outperforms state-of-the-art methods in the cross-database experiments and yields comparable results in the subject-independent experiments.",4 "Supervised generative reconstruction: an efficient way to flexibly store and recognize patterns. 
Matching animal-like flexibility in recognition and the ability to quickly incorporate new information remains difficult. These limits are not yet adequately addressed in neural models or recognition algorithms. This work proposes a configuration for recognition that maintains the function of conventional algorithms but avoids their combinatorial problems. Feedforward recognition algorithms such as classical artificial neural networks and machine learning algorithms are known to be subject to catastrophic interference and forgetting: modifying or learning new information (associations between patterns and labels) causes loss of previously learned information. We demonstrate using mathematical analysis how supervised generative models, with both feedforward and feedback connections, can emulate feedforward algorithms yet avoid catastrophic interference and forgetting. Learned information in generative models is stored in an intuitive form that represents the fixed points or solutions of the network, and moreover displays difficulties similar to cognitive phenomena. The brain-like capabilities and limits associated with generative models suggest that the brain may perform recognition and store information using a similar approach. Given the central role of recognition, progress in understanding the underlying principles may reveal significant insight into how to better study and integrate with the brain.",4 "Does neural machine translation benefit from larger context?. We propose a neural machine translation architecture that models the surrounding text in addition to the source sentence. These models lead to better performance, both in terms of general translation quality and pronoun prediction, when trained on small corpora, although this improvement largely disappears when they are trained on a larger corpus. We also discover that attention-based neural machine translation is well suited for pronoun prediction and compares favorably with approaches specifically designed for this task.",19 "Similarity in intension vs in extension: at the crossroads of computer science and theater. Traditional staging is based on a formal approach to similarity leaning on dramaturgical ontologies and the instantiation of variations. 
Inspired by interactive data mining, we suggest different approaches, and give an overview of research at the intersection of computer science and theater that uses computers as partners of the actor to escape an a priori specification of roles.",4 "Shape tracking with occlusions via coarse-to-fine region-based Sobolev descent. We present a method to track the precise shape of an object in video based on new modeling and optimization on a new Riemannian manifold of parameterized regions. Joint dynamic shape and appearance models, in which a template of the object is propagated to match the object shape and radiance in the next frame, are advantageous over methods employing global image statistics in cases of complex object radiance and cluttered background. In cases of 3D object motion and viewpoint change, self-occlusions and dis-occlusions of the object are prominent, and current methods employing joint shape and appearance models are unable to adapt to new shape and appearance information, leading to inaccurate shape detection. In this work, we model self-occlusions and dis-occlusions in a joint shape and appearance tracking framework. Self-occlusions and the warp that propagates the template are coupled, and thus a joint problem is formulated. We derive a coarse-to-fine optimization scheme, advantageous in object tracking, that initially perturbs the template by coarse perturbations before transitioning to finer-scale perturbations, traversing all scales seamlessly and automatically. The scheme is a gradient descent on a novel infinite-dimensional Riemannian manifold that we introduce. The manifold consists of planar parameterized regions, and the metric that we introduce is a novel Sobolev-type metric defined on infinitesimal vector fields of regions. The metric has the property that the resulting gradient descent automatically favors coarse-scale deformations (when they reduce the energy) before moving to finer-scale deformations. Experiments on video exhibiting occlusion/dis-occlusion, complex radiance and background show that our occlusion/dis-occlusion modeling leads to superior shape accuracy compared to recent methods employing joint shape/appearance models or employing global statistics.",4 "Constrained fractional set programs and their application in local clustering and community detection. 
The (constrained) minimization of a ratio of set functions is a problem that frequently occurs in clustering and community detection. As these optimization problems are typically NP-hard, one uses convex or spectral relaxations in practice. While these relaxations can be solved globally optimally, they are often too loose and thus lead to results far away from the optimum. In this paper we show that every constrained minimization problem of a ratio of non-negative set functions allows a tight relaxation into an unconstrained continuous optimization problem. This result leads to a flexible framework for solving constrained problems in network analysis. While a globally optimal solution for the resulting non-convex problem cannot be guaranteed, we outperform the loose convex or spectral relaxations by a large margin on constrained local clustering problems.",19 "Redefining part-of-speech classes with distributional semantic models. This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries. Our work targets the Universal PoS tag set, which is currently actively used for annotation of a range of languages. We experiment with training classifiers for predicting PoS tags for words based on their embeddings. The results show that the information about PoS affiliation contained in the distributional vectors allows us to discover groups of words with distributional patterns that differ from other words of the same part of speech. This data often reveals hidden inconsistencies of the annotation process or guidelines. At the same time, it supports the notion of 'soft' or 'graded' part of speech affiliations. Finally, we show that information about PoS is distributed among dozens of vector components and is not limited to one or two features.",4 "A fully adaptive algorithm for pure exploration in linear bandits. We propose the first fully-adaptive algorithm for pure exploration in linear bandits---the task is to find the arm with the largest expected reward, which depends linearly on an unknown parameter. As existing methods partially or entirely fix their sequences of arm selections before observing rewards, our method adaptively changes the arm selection strategy based on past observations at each round. We show that our sample complexity matches the achievable lower bound up to a constant factor in an extreme case. 
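The probing setup in the part-of-speech record above, predicting a word's tag from its embedding, can be sketched with a nearest-centroid classifier. The toy 2-D "embeddings", the two-tag inventory, and the centroid classifier itself are all assumptions for illustration; the paper uses BNC-trained vectors and the full Universal PoS tag set:

```python
import numpy as np

# Toy lexicon: word -> (PoS tag, 2-D "embedding"). Purely illustrative.
train = {
    "run": ("VERB", [0.9, 0.1]), "eat": ("VERB", [0.8, 0.2]),
    "dog": ("NOUN", [0.1, 0.9]), "car": ("NOUN", [0.2, 0.8]),
}

# One centroid per tag: the mean embedding of the tag's training words.
centroids = {}
for tag in {t for t, _ in train.values()}:
    vecs = [v for t, v in train.values() if t == tag]
    centroids[tag] = np.mean(vecs, axis=0)

def predict(vec):
    """Assign the tag whose centroid is nearest in embedding space."""
    return min(centroids,
               key=lambda t: np.linalg.norm(np.array(vec) - centroids[t]))
```

Words whose embeddings sit far from their annotated tag's centroid are exactly the "distributionally deviant" groups the paper uses to surface annotation inconsistencies.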
Furthermore, we evaluate the performance of the methods in simulations based on both synthetic settings and real-world data, in which our method shows a vast improvement over existing methods.",19 "Semantic texture for robust dense tracking. We argue that robust dense SLAM systems can make valuable use of the layers of features coming from a standard CNN as a pyramid of 'semantic texture' suitable for dense alignment, while being much more robust to nuisance factors such as lighting than raw RGB values. We use a straightforward Lucas-Kanade formulation of image alignment, with a schedule of iterations over coarse-to-fine levels of a pyramid, and simply replace the usual image pyramid by a hierarchy of convolutional feature maps from a pre-trained CNN. The resulting dense alignment performance is much more robust to lighting and other variations, as we show in camera rotation tracking experiments on time-lapse sequences captured over many hours.",4 "Automatic detection of fake news. The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content. In this paper, we focus on the automatic identification of fake content in online news. Our contribution is twofold. First, we introduce two novel datasets for the task of fake news detection, covering seven different news domains. We describe the collection, annotation, and validation process in detail and present several exploratory analyses of the identification of linguistic differences between fake and legitimate news content. Second, we conduct a set of learning experiments to build accurate fake news detectors. In addition, we provide comparative analyses of the automatic and manual identification of fake news.",4 "Using noisy extractions to discover causal knowledge. Knowledge bases (KB) constructed through information extraction from text play an important role in query answering and reasoning. 
In this work, we study a particular reasoning task: the problem of discovering causal relationships between entities, known as causal discovery. There are two contrasting types of approaches to discovering causal knowledge. One approach attempts to identify causal relationships in text using automatic extraction techniques, while the other approach infers causation from observational data. However, extractions alone are often insufficient to capture complex patterns, and full observational data is expensive to obtain. We introduce a probabilistic method for fusing noisy extractions with observational data to discover causal knowledge. We propose a principled approach that uses the probabilistic soft logic (PSL) framework to encode well-studied constraints and recover long-range patterns and consistent predictions, while cheaply acquired extractions provide a proxy for unseen observations. We apply our method to gene regulatory networks and show the promise of exploiting KB signals for causal discovery, suggesting a critical, new area of research.",4 "Sense embedding learning for word sense induction. Conventional word sense induction (WSI) methods usually represent each instance with discrete linguistic features or cooccurrence features, and train a model for each polysemous word individually. In this work, we propose to learn sense embeddings for the WSI task. In the training stage, our method induces several sense centroids (embeddings) for each polysemous word. In the testing stage, our method represents each instance as a contextual vector, and induces its sense by finding the nearest sense centroid in the embedding space. The advantages of our method are (1) distributed sense vectors are taken as the knowledge representations and are trained discriminatively, usually achieving better performance than traditional count-based distributional models, and (2) a general model for the whole vocabulary is jointly trained to induce sense centroids in a multitask learning framework. Evaluated on the SemEval-2010 WSI dataset, our method outperforms all participants and most recent state-of-the-art methods. We further verify the two advantages by comparing against carefully designed baselines.",4 "Web page categorization using artificial neural networks. Web page categorization is one of the challenging tasks in the world of ever increasing web technologies. 
There are many ways of categorizing web pages based on different approaches and features. This paper proposes a new dimension in the categorization of web pages using an artificial neural network (ANN) that extracts features automatically. Eight major categories of web pages have been selected for categorization: business & economy, education, government, entertainment, sports, news & media, job search, and science. The whole process of the proposed system is done in three successive stages. In the first stage, the features are automatically extracted by analyzing the source of the web pages. The second stage involves fixing the input values of the neural network; these values remain between 0 and 1, and variations in the values affect the output. Finally, the third stage determines the class of a certain web page out of the eight predefined classes. This stage uses the back propagation algorithm of the artificial neural network. The proposed concept will facilitate web mining, retrieval of information from the web, and also search engines.",4 "Information-theoretic limits of Bayesian network structure learning. In this paper, we study the information-theoretic limits of learning the structure of Bayesian networks (BNs), on discrete as well as continuous random variables, from a finite number of samples. We show that the minimum number of samples required by any procedure to recover the correct structure grows as $\Omega(m)$ and $\Omega(k \log m + (k^2/m))$ for non-sparse and sparse BNs respectively, where $m$ is the number of variables and $k$ is the maximum number of parents per node. We provide a simple recipe, based on an extension of Fano's inequality, for obtaining information-theoretic limits of structure recovery for any exponential family BN. We instantiate our result for specific conditional distributions in the exponential family to characterize the fundamental limits of learning various commonly used BNs, such as conditional probability table based networks, Gaussian BNs, noisy-OR networks, and logistic regression networks. En route to obtaining our main results, we obtain tight bounds on the number of sparse and non-sparse essential-DAGs. Finally, as a byproduct, we recover the information-theoretic limits of sparse variable selection for logistic regression.",4 "Discussion: latent variable graphical model selection via convex optimization. 
Discussion of ""Latent variable graphical model selection via convex optimization"" by Venkat Chandrasekaran, Pablo A. Parrilo and Alan S. Willsky [arXiv:1008.1290].",12 "Nonconvex matrix factorization from rank-one measurements. We consider the problem of recovering low-rank matrices from random rank-one measurements, which spans numerous applications including covariance sketching, phase retrieval, quantum state tomography, and learning shallow polynomial neural networks, among others. Our approach is to directly estimate the low-rank factor by minimizing a nonconvex quadratic loss function via vanilla gradient descent, following a tailored spectral initialization. When the true rank is small, this algorithm is guaranteed to converge to the ground truth (up to global ambiguity) with near-optimal sample complexity and computational complexity. To the best of our knowledge, this is the first guarantee that achieves near-optimality in both metrics. In particular, the key enabler of near-optimal computational guarantees is an implicit regularization phenomenon: without explicit regularization, both the spectral initialization and the gradient descent iterates automatically stay within a region incoherent with the measurement vectors. This feature allows one to employ much more aggressive step sizes compared with the ones suggested in prior literature, without the need for sample splitting.",4 "Parallel statistical multi-resolution estimation. We discuss several strategies to implement Dykstra's projection algorithm on NVIDIA's compute unified device architecture (CUDA). Dykstra's algorithm is the central step in and the computationally most expensive part of statistical multi-resolution methods. It projects a given vector onto the intersection of convex sets. Compared to a CPU implementation, our CUDA implementation is one order of magnitude faster. For a further speed up and to reduce memory consumption, we have developed a new variant, which we call the incomplete Dykstra's algorithm. Implemented in CUDA, it is one order of magnitude faster than the CUDA implementation of the standard Dykstra algorithm. As a sample application, we discuss using the incomplete Dykstra's algorithm as a preprocessor for the recently developed super-resolution optical fluctuation imaging (SOFI) method (Dertinger et al. 2009). 
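The rank-one measurement recovery described in the nonconvex matrix factorization record above can be sketched in its simplest (rank-one, real-valued) form: recover $x$ from $y_i = (a_i^\top x)^2$ by gradient descent on a quadratic loss after a spectral initialization. Problem sizes, step size, and iteration count are assumptions, not the paper's exact choices:

```python
import numpy as np

# Ground truth and rank-one measurements y_i = (a_i^T x)^2.
rng = np.random.default_rng(0)
n, m = 20, 400
x_true = rng.normal(size=n)
A = rng.normal(size=(m, n))
y = (A @ x_true) ** 2

# Spectral initialization: top eigenvector of (1/m) sum_i y_i a_i a_i^T,
# scaled so that ||z||^2 matches mean(y) = ||x||^2 in expectation.
Y = (A.T * y) @ A / m
_, eigvec = np.linalg.eigh(Y)
z = eigvec[:, -1] * np.sqrt(y.mean())

# Vanilla gradient descent on f(z) = (1/4m) sum_i ((a_i^T z)^2 - y_i)^2.
eta = 0.1 / y.mean()                 # conservative step size
for _ in range(500):
    Az = A @ z
    z -= eta * (A.T @ ((Az ** 2 - y) * Az)) / m

# z is recoverable only up to global sign, so compare rank-one matrices.
err = np.linalg.norm(np.outer(z, z) - np.outer(x_true, x_true))
err /= np.linalg.norm(np.outer(x_true, x_true))
```

Note there is no explicit regularization or projection step in the loop; the implicit-regularization claim of the record is that the iterates stay incoherent with the measurement vectors on their own.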
We show that statistical multi-resolution estimation can enhance the resolution improvement of the plain SOFI algorithm just as the Fourier-reweighting of SOFI does. The results are compared in terms of their power spectrum and their Fourier ring correlation (Saxton and Baumeister 1982). The Fourier ring correlation indicates that the resolution for typical second order SOFI images can be improved by about 30 per cent. Our results show that careful parallelization of Dykstra's algorithm enables the use of large-scale statistical multi-resolution analyses.",15 "Resolution of unidentified words in machine translation. This paper presents a mechanism for resolving unidentified lexical units in text-based machine translation (TBMT). A machine translation (MT) system is unlikely to have a complete lexicon, hence there is an intense need for a new mechanism to handle the problem of unidentified words. These unknown words could be abbreviations, names, acronyms or newly introduced terms. We have proposed an algorithm for the resolution of unidentified words. This algorithm takes a discourse unit (primitive discourse) as the unit of analysis and provides real time updates to the lexicon. We have manually applied the algorithm to newspaper fragments. Along with anaphora and cataphora resolution, many unknown words, especially names and abbreviations, were updated to the lexicon.",4 "A unified heuristic and an annotated bibliography for a large class of earliness-tardiness scheduling problems. This work proposes a unified heuristic algorithm for a large class of earliness-tardiness (E-T) scheduling problems. We consider single/parallel machine E-T problems that may or may not consider additional features such as idle time, setup times and release dates. In addition, we also consider problems whose objective is to minimize either the total (average) weighted completion time or the total (average) weighted flow time, which arise as particular cases when the due dates of all jobs are either set to zero or to their associated release dates, respectively. The developed local search based metaheuristic framework is quite simple, but at the same time relies on sophisticated procedures for efficiently performing local search according to the characteristics of the problem. We present efficient move evaluation approaches for the parallel machine problems that generalize the existing ones for the single machine problems. 
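Dykstra's projection algorithm, which the multi-resolution record above parallelizes, projects a point onto the intersection of convex sets. A minimal CPU sketch for two sets; the particular sets (a box and a halfspace) are illustrative assumptions:

```python
import numpy as np

def project_box(v, lo=-1.0, hi=1.0):
    return np.clip(v, lo, hi)

def project_halfspace(v, a, b):
    """Euclidean projection onto {x : a.x <= b}."""
    excess = a @ v - b
    if excess <= 0:
        return v.copy()
    return v - excess * a / (a @ a)

def dykstra(x0, a, b, iters=200):
    """Project x0 onto (box) ∩ (halfspace). Unlike plain alternating
    projections, the correction terms p and q make the iterates converge
    to the *nearest* point of the intersection."""
    x, p, q = x0.copy(), np.zeros_like(x0), np.zeros_like(x0)
    for _ in range(iters):
        y = project_box(x + p)
        p = x + p - y
        x = project_halfspace(y + q, a, b)
        q = y + q - x
    return x

x0 = np.array([2.0, 2.0])
a, b = np.array([1.0, 1.0]), 1.0     # halfspace: x + y <= 1
z = dykstra(x0, a, b)                # nearest feasible point is (0.5, 0.5)
```

Each inner projection is an independent, embarrassingly parallel per-coordinate or per-constraint operation, which is what makes the GPU implementation in the record attractive.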
The algorithm was tested on hundreds of instances of several E-T problems and their particular cases. The results obtained show that our unified heuristic is capable of producing high quality solutions when compared to the best ones available in the literature, which were obtained by problem-specific methods. Moreover, we provide an extensive annotated bibliography of the problems related to those considered in this work, in which we not only indicate the approach(es) used in each publication, but also point out the characteristics of the problem(s) considered. Beyond that, we classify the existing methods into different categories to give a better idea of the popularity of each type of solution procedure.",4 "Task specific visual saliency prediction with memory augmented conditional generative adversarial networks. Visual saliency patterns are the result of a variety of factors aside from the image being parsed; however, existing approaches have ignored these. To address this limitation, we propose a novel saliency estimation model which leverages the semantic modelling power of conditional generative adversarial networks together with memory architectures that capture the subject's behavioural patterns and task dependent factors. We make contributions aiming to bridge the gap between the bottom-up feature learning capabilities of modern deep learning architectures and traditional top-down hand-crafted-features based methods for task specific saliency modelling. The conditional nature of the proposed framework enables us to learn contextual semantics and relationships among different tasks together, instead of learning them separately for each task. Our studies not only shed light on a novel application area for generative adversarial networks, but also emphasise the importance of task specific saliency modelling and demonstrate the plausibility of fully capturing this context via an augmented memory architecture.",4 "Learning flexible and reusable locomotion primitives for a microrobot. The design of gaits for robot locomotion can be a daunting process which requires significant expert knowledge and engineering. This process is even more challenging for robots that do not have an accurate physical model, such as compliant or micro-scale robots. Data-driven gait optimization provides an automated alternative to analytical gait design. In this paper, we propose a novel approach to efficiently learn a wide range of locomotion tasks with walking robots. 
this approach formalizes locomotion as a contextual policy search task to collect data, and subsequently uses that data to learn multi-objective locomotion primitives that can be used for planning. as a proof-of-concept we consider a simulated hexapod modeled after a recently developed microrobot, and we thoroughly evaluate the performance of this microrobot on different tasks and gaits. our results validate the proposed controller and learning scheme on single and multi-objective locomotion tasks. moreover, experimental simulations show that without any prior knowledge about the robot used (e.g., dynamics model), our approach is capable of learning locomotion primitives within 250 trials and of subsequently using them to successfully navigate through a maze.",4 "good arm identification via bandit feedback. we consider a novel stochastic multi-armed bandit problem called {\em good arm identification} (gai), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold. gai is a pure-exploration problem in which a single agent repeats a process of outputting an arm as soon as it is identified as a good one before confirming that the other arms are actually not good. the objective of gai is to minimize the number of samples for each process. we find that gai faces a new kind of dilemma, the {\em exploration-exploitation dilemma of confidence}, which is different from the difficulty in best arm identification. as a result, an efficient design of algorithms for gai is quite different from that for best arm identification. we derive a lower bound on the sample complexity of gai that is tight up to the logarithmic factor $\mathrm{o}(\log \frac{1}{\delta})$ of the acceptance error rate $\delta$. we also develop an algorithm whose sample complexity almost matches the lower bound. we also confirm experimentally that our proposed algorithm outperforms naive algorithms in synthetic settings based on a conventional bandit problem and clinical trial researches for rheumatoid arthritis.",19 "a decision theoretic approach to targeted advertising. a simple advertising strategy that can be used to help increase sales of a product is to mail out special offers to selected potential customers. because there is a cost associated with sending each offer, the optimal mailing strategy depends on both the benefit obtained from a purchase and on how the offer affects the buying behavior of the customers.
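a hedged sketch of a threshold-based confidence-bound loop in the spirit of the good arm identification abstract above (this is an illustrative assumption, not the authors' algorithm): pull the active arm with the highest upper confidence bound, output an arm as good once its lower bound clears the threshold, and reject it once its upper bound falls below.

```python
# illustrative gai-style loop on bernoulli arms; confidence radii are a
# simple hoeffding-style choice made up for this sketch.
import math, random

def gai_sketch(arms, threshold, delta=0.05, max_pulls=20000, seed=0):
    rng = random.Random(seed)
    k = len(arms)
    n, mean = [0] * k, [0.0] * k
    good, active = [], set(range(k))
    for t in range(1, max_pulls + 1):
        if not active:
            break
        rad = lambda i: math.sqrt(math.log(4 * k * t * t / delta) / (2 * max(n[i], 1)))
        i = max(active, key=lambda i: mean[i] + rad(i))     # most optimistic arm
        x = 1.0 if rng.random() < arms[i] else 0.0          # bernoulli reward
        n[i] += 1
        mean[i] += (x - mean[i]) / n[i]
        if mean[i] - rad(i) >= threshold:                   # confidently good
            good.append(i); active.discard(i)
        elif mean[i] + rad(i) < threshold:                  # confidently not good
            active.discard(i)
    return good

print(gai_sketch([0.9, 0.5, 0.1], threshold=0.7))
```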
in this paper, we describe two methods for partitioning the potential customers into groups, and we show how to perform a simple cost-benefit analysis to decide which, if any, of the groups should be targeted. in particular, we consider two decision-tree learning algorithms. the first is an ""off the shelf"" algorithm used to model the probability that groups of customers will buy the product. the second is a new algorithm that is similar to the first, except that for each group, it explicitly models the probability of purchase under the two mailing scenarios: (1) the mail is sent to members of that group and (2) the mail is not sent to members of that group. using data from a real-world advertising experiment, we compare the algorithms to each other and to a naive mail-to-all strategy.",4 "parallel training of dnns with natural gradient and parameter averaging. we describe the neural-network training framework used in the kaldi speech recognition toolkit, which is geared towards training dnns with large amounts of training data using multiple gpu-equipped or multi-core machines. in order to be as hardware-agnostic as possible, we needed a way to use multiple machines without generating excessive network traffic. our method is to average the neural network parameters periodically (typically every minute or two), and redistribute the averaged parameters to the machines for further training. each machine sees different data. by itself, this method does not work well. however, we have another method, an approximate and efficient implementation of natural gradient for stochastic gradient descent (ng-sgd), which seems to allow our periodic-averaging method to work well, as well as substantially improving the convergence of sgd on a single machine.",4 "learning causal graphs with small interventions. we consider the problem of learning causal networks with interventions, when each intervention is limited in size under pearl's structural equation model with independent errors (sem-ie). the objective is to minimize the number of experiments to discover the causal directions of all the edges in a causal graph. previous work has focused on the use of separating systems for complete graphs for this task. we prove that any deterministic adaptive algorithm needs to be a separating system in order to learn complete graphs in the worst case. in addition, we present a novel separating system construction, whose size is close to optimal and is arguably simpler than previous work in combinatorics.
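the periodic parameter averaging described in the kaldi training abstract above can be sketched on a toy least-squares model; this is an illustrative assumption for exposition, not kaldi's actual implementation:

```python
# minimal sketch of periodic parameter averaging: several "machines" each run
# sgd on their own data shard, then parameters are averaged and redistributed.
import random

def local_sgd(w, shard, lr=0.1):
    # one pass of sgd for the 1-d least-squares model y = w * x
    for x, y in shard:
        w -= lr * 2 * x * (w * x - y)   # gradient of (w*x - y)**2
    return w

rng = random.Random(0)
data = [(x, 3.0 * x) for x in [rng.uniform(-1, 1) for _ in range(400)]]
shards = [data[i::4] for i in range(4)]          # one shard per "machine"
w = 0.0
for _ in range(5):                               # 5 averaging rounds
    w = sum(local_sgd(w, s) for s in shards) / len(shards)
print(round(w, 3))                               # close to the true slope 3.0
```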
we also develop a novel information theoretic lower bound on the number of interventions that applies in full generality, including for randomized adaptive learning algorithms. for general chordal graphs, we derive worst case lower bounds on the number of interventions. building on observations about induced trees, we give a new deterministic adaptive algorithm to learn the directions on any chordal skeleton completely. in the worst case, our achievable scheme is an $\alpha$-approximation algorithm where $\alpha$ is the independence number of the graph. we also show that there exist graph classes for which the sufficient number of experiments is close to the lower bound. at the other extreme, there are graph classes for which the required number of experiments is multiplicatively $\alpha$ away from our lower bound. in simulations, our algorithm almost always performs very close to the lower bound, while the approach based on separating systems for complete graphs is significantly worse for random chordal graphs.",4 "improving facial attribute prediction using semantic segmentation. attributes are semantically meaningful characteristics whose applicability widely crosses category boundaries. they are particularly important in describing and recognizing concepts for which no explicit training example is given, \textit{e.g., zero-shot learning}. additionally, since attributes are human describable, they can be used for efficient human-computer interaction. in this paper, we propose to employ semantic segmentation to improve facial attribute prediction. the core idea lies in the fact that many facial attributes describe local properties. in other words, the probability of an attribute appearing in a face image is far from uniform in the spatial domain. we build our facial attribute prediction model jointly with a deep semantic segmentation network. this harnesses the localization cues learned by the semantic segmentation to guide the attention of the attribute prediction to the regions where different attributes naturally show up. as a result of this approach, in addition to recognition, we are able to localize the attributes, despite merely having access to image level labels (weak supervision) during training. we evaluate our proposed method on celeba and lfwa datasets and achieve superior results to the prior arts. furthermore, we show that in the reverse problem, semantic face parsing improves when facial attributes are available.
this reaffirms the need to jointly model these two interconnected tasks.",4 "disfluency detection using a bidirectional lstm. we introduce a new approach for disfluency detection using a bidirectional long-short term memory neural network (blstm). in addition to the word sequence, the model takes as input pattern match features that were developed to reduce sensitivity to vocabulary size in training, which leads to improved performance over the word sequence alone. the blstm takes advantage of explicit repair states in addition to the standard reparandum states. the final output leverages integer linear programming to incorporate constraints of disfluency structure. in experiments on the switchboard corpus, the model achieves state-of-the-art performance for both the standard disfluency detection task and the correction detection task. analysis shows that the model has better detection of non-repetition disfluencies, which tend to be much harder to detect.",4 "identity alignment by noisy pixel removal. identity alignment models assume that images are precisely annotated manually. such human labelling is unrealistic for large sized imagery data, and detection models instead introduce varying amounts of noise that hamper identity alignment performance. in this work, we propose to refine the images by removing undesired pixels. this is achieved by learning to eliminate less informative pixels for identity alignment. to this end, we formulate a method for automatically detecting and removing identity class irrelevant pixels from auto-detected bounding boxes. experiments validate the benefits of our model in improving identity alignment.",4 "bootstrapped adaptive threshold selection for statistical model selection and estimation. a central goal of neuroscience is to understand how activity in the nervous system is related to features of the external world, or to features of the nervous system itself. a common approach is to model neural responses as a weighted combination of external features, or vice versa. the structure of the model weights can provide insight into neural representations. often, neural input-output relationships are sparse, with only a few inputs contributing to the output. in part to account for this sparsity, structured regularizers are incorporated into the model fitting optimization. however, by imposing priors, structured regularizers can make it difficult to interpret the learned model parameters.
here, we investigate a simple, minimally structured model estimation method for accurate, unbiased estimation of sparse models, based on bootstrapped adaptive threshold selection followed by ordinary least-squares refitting (boats). in extensive numerical investigations, we show that this method often performs favorably compared to l1 and l2 regularizers. in particular, for a variety of model distributions and noise levels, boats accurately recovers the parameters of sparse models, leading to more parsimonious explanations of the outputs. finally, we apply this method to the task of decoding human speech production from ecog recordings.",19 "highlighting objects of interest in an image by integrating saliency and depth. stereo images have been captured primarily for 3d reconstruction in the past. however, the depth information acquired from stereo can also be used along with saliency to highlight certain objects in a scene. this approach can be used to make still images more interesting to look at, and to highlight objects of interest in the scene. we introduce this novel direction in this paper, and discuss the theoretical framework behind the approach. even though we use depth from stereo in this work, our approach is applicable to depth data acquired from any sensor modality. experimental results on indoor and outdoor scenes demonstrate the benefits of our algorithm.",4 "embedded deep learning based word prediction. recent developments in deep learning with applications to language modeling have led to success in tasks of text processing, summarizing and machine translation. however, deploying huge language models on mobile devices for on-device keyboards poses a computation bottleneck due to their puny computation capacities. in this work we propose an embedded deep learning based word prediction method that optimizes run-time memory and also provides a real-time prediction environment. our model has a size of 7.40mb and an average prediction time of 6.47 ms. we improve over existing methods for word prediction in terms of key stroke savings and word prediction rate.",4 "end-to-end video classification with knowledge graphs. video understanding has attracted much research attention, especially since the recent availability of large-scale video benchmarks. in this paper, we address the problem of multi-label video classification.
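the boats procedure summarized above (bootstrap, threshold, then ols refit) can be sketched with numpy; the specific thresholding rule below is a simplified assumption for illustration, not the authors' code:

```python
# illustrative boats-style sketch: bootstrap ols fits, keep coefficients whose
# bootstrap distribution stays away from zero, then refit ols on that support.
import numpy as np

def boats_sketch(X, y, n_boot=200, keep_frac=0.9, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    coefs = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)            # bootstrap resample
        coefs[b] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    q = (1 - keep_frac) / 2
    lo, hi = np.quantile(coefs, [q, 1 - q], axis=0)
    support = (lo > 0) | (hi < 0)                   # interval excludes zero
    w = np.zeros(p)
    if support.any():                               # ols refit on the support
        w[support] = np.linalg.lstsq(X[:, support], y, rcond=None)[0]
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10); w_true[[0, 3]] = [2.0, -1.5]
y = X @ w_true + 0.1 * rng.normal(size=200)
print(np.flatnonzero(np.abs(boats_sketch(X, y)) > 0.5))
```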
we first observe that there exists a significant knowledge gap between how machines and humans learn. that is, while current machine learning approaches, including deep neural networks, largely focus on the representations of the given data, humans often look beyond the data at hand and leverage external knowledge to make better decisions. towards narrowing this gap, we propose to incorporate external knowledge graphs into video classification. in particular, we unify traditional ""knowledgeless"" machine learning models and knowledge graphs in a novel end-to-end framework. the framework is flexible enough to work with existing video classification algorithms, including state-of-the-art deep models. finally, we conduct extensive experiments on the largest public video dataset, youtube-8m. the results are promising across the board, improving mean average precision by up to 2.9%.",4 "input warping for bayesian optimization of non-stationary functions. bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions. the ability to accurately model distributions over functions is critical to the effectiveness of bayesian optimization. although gaussian processes provide a flexible prior over functions which can be queried efficiently, there are various classes of functions that remain difficult to model. one of the most frequently occurring of these is the class of non-stationary functions. the optimization of the hyperparameters of machine learning algorithms is a problem domain in which parameters are often manually transformed a priori, for example by optimizing in ""log-space,"" to mitigate the effects of spatially-varying length scale. we develop a methodology for automatically learning a wide family of bijective transformations or warpings of the input space using the beta cumulative distribution function. we further extend the warping framework to multi-task bayesian optimization so that multiple tasks can be warped into a jointly stationary space. on a set of challenging benchmark optimization tasks, we observe that the inclusion of warping greatly improves on the state-of-the-art, producing better results faster and more reliably.",19 "view-invariant recognition of action style self-dissimilarity. self-similarity was recently introduced as a measure of inter-class congruence for classification of actions.
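the bijective input warping described in the bayesian optimization abstract above uses the beta cdf; a minimal sketch follows, computing the cdf by simple numerical integration (a real implementation would use e.g. scipy.special.betainc; the parameter values are arbitrary assumptions):

```python
# illustrative beta-cdf input warp on [0, 1].
import math

def beta_cdf(x, a, b, steps=10000):
    # integrate the beta(a, b) density from 0 to x with the trapezoid rule
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    pdf = lambda t: t ** (a - 1) * (1 - t) ** (b - 1) / norm
    h = x / steps
    total = 0.5 * (pdf(1e-12) + pdf(x if x < 1 else 1 - 1e-12))
    total += sum(pdf(i * h) for i in range(1, steps))
    return min(1.0, total * h)

# a = b = 1 is the identity warp; other choices stretch or compress
# different parts of the input space to undo non-stationarity.
print(round(beta_cdf(0.25, 1, 1), 3))   # → 0.25 (identity)
print(round(beta_cdf(0.25, 2, 2), 3))   # → 0.156
```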
herein, we investigate the dual problem of intra-class dissimilarity for classification of action styles. we introduce self-dissimilarity matrices that discriminate between actions performed by different subjects regardless of viewing direction and camera parameters. we investigate two frameworks using these invariant style dissimilarity measures, based on principal component analysis (pca) and fisher discriminant analysis (fda). extensive experiments performed on the ixmas dataset indicate remarkably good discriminant characteristics for the proposed invariant measures for gender recognition from video data.",4 "many languages, one parser. we train one multilingual model for dependency parsing and use it to parse sentences in several languages. the parsing model uses (i) multilingual word clusters and embeddings; (ii) token-level language information; and (iii) language-specific features (fine-grained pos tags). this input representation enables the parser not only to parse effectively in multiple languages, but also to generalize across languages based on linguistic universals and typological similarities, making it more effective to learn from limited annotations. our parser's performance compares favorably to strong baselines in a range of data scenarios, including when the target language has a large treebank, a small treebank, or no treebank for training.",4 "treeview: peeking into deep neural networks via feature-space partitioning. with the advent of highly predictive but opaque deep learning models, it has become more important than ever to understand and explain the predictions of such models. existing approaches define interpretability as the inverse of complexity and achieve interpretability at the cost of accuracy. this introduces the risk of producing interpretable but misleading explanations. as humans, we are prone to engage in this kind of behavior \cite{mythos}. in this paper, we take a step in the direction of tackling the problem of interpretability without compromising model accuracy. we propose to build a treeview representation of the complex model via hierarchical partitioning of the feature space, which reveals the iterative rejection of unlikely class labels until the correct association is predicted.",19 "correlation-based construction of neighborhood and edge features.
motivated by an abstract notion of low-level edge detector filters, we propose a simple method of unsupervised feature construction based on pairwise statistics of features. in the first step, we construct neighborhoods of features by regrouping features that correlate. then we use these subsets as filters to produce new neighborhood features. next, we connect neighborhood features that correlate, and construct edge features by subtracting the correlated neighborhood features of each other. to validate the usefulness of the constructed features, we ran adaboost.mh on four multi-class classification problems. our most significant result is a test error of 0.94% on mnist with an algorithm which is essentially free of any image-specific priors. on cifar-10 our method is suboptimal compared to today's best deep learning techniques; nevertheless, we show that the proposed method outperforms not only boosting on the raw pixels, but also boosting on haar filters.",4 "transfer deep learning for low-resource chinese word segmentation with a novel neural network. recent studies have shown the effectiveness of using neural networks for chinese word segmentation. however, these models rely on large-scale data and are less effective on low-resource datasets with insufficient training data. we propose a transfer learning method to improve low-resource word segmentation by leveraging high-resource corpora. first, we train a teacher model on the high-resource corpora and use the learned knowledge to initialize a student model. second, a weighted data similarity method is proposed to train the student model on the low-resource data. experiment results show that our work significantly improves the performance on low-resource datasets: 2.3% and 1.5% f-score on the pku and ctb datasets. furthermore, this paper achieves state-of-the-art results: 96.1% and 96.2% f-score on the pku and ctb datasets.",4 "hybridization of evolutionary algorithms. evolutionary algorithms are good general problem solvers but suffer from a lack of domain specific knowledge. however, problem specific knowledge can be added to evolutionary algorithms by hybridizing. interestingly, all the elements of evolutionary algorithms can be hybridized. in this chapter, the hybridization of three elements of evolutionary algorithms is discussed: the objective function, the survivor selection operator and the parameter settings.
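the correlation-based construction summarized in the edge-features abstract above can be sketched with numpy; the correlation threshold and the toy data are made-up assumptions, not the paper's setup:

```python
# illustrative sketch: group correlated raw features into neighborhoods,
# average each neighborhood into a new feature, then subtract correlated
# neighborhood features from each other to obtain edge features.
import numpy as np

def neighborhood_and_edge_features(X, thr=0.7):
    C = np.corrcoef(X, rowvar=False)               # feature-feature correlation
    nbh = [np.flatnonzero(C[j] >= thr) for j in range(C.shape[1])]
    N = np.stack([X[:, idx].mean(axis=1) for idx in nbh], axis=1)
    Cn = np.corrcoef(N, rowvar=False)              # correlations among neighborhoods
    edges = [N[:, i] - N[:, j]
             for i in range(N.shape[1]) for j in range(i + 1, N.shape[1])
             if Cn[i, j] >= thr]
    E = np.stack(edges, axis=1) if edges else np.empty((X.shape[0], 0))
    return N, E

# a chain of four features where neighbours correlate (corr ~ 0.8)
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + 0.6 * rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x3 = 0.8 * x2 + 0.6 * rng.normal(size=n)
N, E = neighborhood_and_edge_features(np.column_stack([x0, x1, x2, x3]))
print(N.shape, E.shape)
```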
for the objective function, an existing heuristic function that constructs a solution of the problem in the traditional way is used. however, this function is embedded into the evolutionary algorithm, where it serves as a generator of new solutions. in addition, the objective function is improved by local search heuristics. a new neutral selection operator is developed that is capable of dealing with neutral solutions, i.e. solutions with different representations that expose equal values of the objective function. the aim of this operator is to direct the evolutionary search into new, undiscovered regions of the search space. to avoid a wrong setting of the parameters that control the behavior of the evolutionary algorithm, self-adaptation is used. finally, the hybrid self-adaptive evolutionary algorithm is applied to two real-world np-hard problems: graph 3-coloring and the optimization of markers in the clothing industry. extensive experiments have shown that the hybridization improves the results of evolutionary algorithms a lot. furthermore, the impact of the particular hybridizations is analyzed in detail as well.",4 "reconstruction-based disentanglement for pose-invariant face recognition. deep neural networks (dnns) trained on large-scale datasets have recently achieved impressive improvements in face recognition. but a persistent challenge remains to develop methods capable of handling large pose variations that are relatively underrepresented in training data. this paper presents a method for learning a feature representation that is invariant to pose, without requiring extensive pose coverage in the training data. we first propose to generate non-frontal views from a single frontal face, in order to increase the diversity of training data while preserving accurate facial details that are critical for identity discrimination. our next contribution is to seek a rich embedding that encodes identity features, as well as non-identity ones such as pose and landmark locations. finally, we propose a new feature reconstruction metric learning to explicitly disentangle identity and pose, by demanding alignment between the feature reconstructions through various combinations of identity and pose features, obtained from two images of the same subject.
experiments on both controlled and in-the-wild face datasets, such as multipie, 300wlp and the profile view database cfp, show that our method consistently outperforms the state-of-the-art, especially on images with large head pose variations. for detailed results and resources please refer to https://sites.google.com/site/xipengcshomepage/iccv2017",4 "stacked transfer learning for tropical cyclone intensity prediction. tropical cyclone wind-intensity prediction is a challenging task considering the drastic changes of climate patterns over the last few decades. in order to develop robust prediction models, one needs to consider the different characteristics of cyclones in terms of spatial and temporal characteristics. transfer learning incorporates knowledge from a related source dataset to complement a target dataset, especially in cases where there is a lack of data. stacking is a form of ensemble learning focused on improving generalization that has recently been used for transfer learning problems, where it is referred to as transfer stacking. in this paper, we employ transfer stacking as a means of studying the effects of cyclones, whereby we evaluate whether cyclones in different geographic locations can be helpful in improving generalization performance. moreover, we use conventional neural networks for evaluating the effects of the duration of cyclones on prediction performance. we therefore develop an effective strategy that evaluates the relationships between different types of cyclones through transfer learning and conventional learning methods via neural networks.",4 "parallel markov chain monte carlo for the indian buffet process. indian buffet process based models are an elegant way for discovering underlying features within a data set, but inference in such models can be slow. inferring underlying features using markov chain monte carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. existing attempts at distributing inference have introduced additional approximations within the inference procedure. in this paper we present a novel algorithm to perform asymptotically exact parallel markov chain monte carlo inference for indian buffet process models. we take advantage of the fact that the features are conditionally independent under the beta-bernoulli process.
because of this conditional independence, we can partition the features into two parts: one part containing the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. for the finite partition, parallel inference is simple given the instantiation of the features. for the infinite tail, performing uncollapsed mcmc leads to poor mixing, and hence we collapse out the features. the resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior.",19 "bubbleview: an interface for crowdsourcing image importance maps and tracking visual attention. in this paper, we present bubbleview, an alternative methodology for eye tracking using discrete mouse clicks to measure which information people consciously choose to examine. bubbleview is a mouse-contingent, moving-window interface in which participants are presented with a series of blurred images and click to reveal ""bubbles"" - small, circular areas of the image at original resolution, similar to the confined area of focus of the eye fovea. across 10 experiments with 28 different parameter combinations, we evaluated bubbleview on a variety of image types: information visualizations, natural images, static webpages, and graphic designs, and compared the clicks to eye fixations collected with eye-trackers in controlled lab settings. we found that bubbleview clicks can (i) successfully approximate eye fixations on different images, and (ii) be used to rank image and design elements by importance. bubbleview is designed to collect clicks on static images, and works best for defined tasks such as describing the content of an information visualization or measuring image importance. bubbleview data is cleaner and more consistent than that of related methodologies that use continuous mouse movements. our analyses validate the use of mouse-contingent, moving-window methodologies for approximating eye fixations across different image and task types.",4 "towards an ontology-driven blockchain design for supply chain provenance. an interesting research problem in our age of big data is that of determining provenance. granular evaluation of provenance of physical goods--e.g.
tracking the ingredients of a pharmaceutical or demonstrating the authenticity of luxury goods--has often not been possible with today's items that are produced and transported in complex, inter-organizational, often internationally-spanning supply chains. the recent adoption of internet of things and blockchain technologies gives promise for better supply chain provenance. we are particularly interested in the blockchain, as many of its favoured use cases involve provenance tracking. we are also interested in applying ontologies, as there has been work done on knowledge provenance, traceability, and food provenance using ontologies. in this paper, we make a case for how ontologies can contribute to blockchain design. to support this case, we analyze a traceability ontology and translate some of its representations to smart contracts that execute a provenance trace and enforce traceability constraints on the ethereum blockchain platform.",4 "deep semantic classification for 3d lidar data. robots are expected to operate autonomously in dynamic environments. understanding the underlying dynamic characteristics of objects is a key enabler for achieving this goal. in this paper, we propose a method for pointwise semantic classification of 3d lidar data into three classes: non-movable, movable and dynamic. we concentrate on understanding these specific semantics because they characterize important information required for an autonomous system. non-movable points in the scene belong to unchanging segments of the environment, whereas the remaining classes correspond to the changing parts of the scene. the difference between the movable and dynamic classes is their motion state: dynamic points can be perceived as moving, whereas movable objects can move but are perceived as static. to learn the distinction between movable and non-movable points in the environment, we introduce an approach based on deep neural networks, and for detecting dynamic points, we estimate pointwise motion. we propose a bayes filter framework for combining the learned semantic cues with the motion cues to infer the required semantic classification. in extensive experiments, we compare our approach with other methods on a standard benchmark dataset and report competitive results in comparison to the existing state-of-the-art. furthermore, we show an improvement in the classification of points by combining the semantic cues retrieved from the neural network with the motion cues.",4 "boosting neural machine translation.
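a hedged sketch of the bayes-filter idea in the lidar abstract above: fuse a semantic cue and a motion cue about a point by accumulating log-odds of the "dynamic" class over time. the cue probabilities below are invented, and treating the cues as independent is a simplifying assumption, not the paper's exact model:

```python
# minimal recursive log-odds fusion of two per-frame cues.
import math

def log_odds(p):
    return math.log(p / (1.0 - p))

def fuse(prior, cue_likelihoods):
    """update p(dynamic) with independent (semantic, motion) cues per frame."""
    l = log_odds(prior)
    for p_semantic, p_motion in cue_likelihoods:
        l += log_odds(p_semantic) + log_odds(p_motion)
    return 1.0 / (1.0 + math.exp(-l))

# a point the network calls movable (0.7) with motion evidence in 3 frames:
p = fuse(0.5, [(0.7, 0.8), (0.7, 0.9), (0.7, 0.6)])
print(round(p, 3))
```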
training efficiency is one of the main problems for neural machine translation (nmt). deep networks need very large data as well as many training iterations to achieve state-of-the-art performance. this results in very high computation cost, slowing down research and industrialisation. in this paper, we propose to alleviate this problem with several training methods based on data boosting and bootstrap, with no modifications to the neural network. it imitates the learning process of humans, who typically spend more time when learning ""difficult"" concepts than easier ones. we experiment on an english-french translation task showing accuracy improvements of up to 1.63 bleu while saving 20% of training time.",4 "a predictive coding-based deep dynamic neural network for visuomotor learning. this study presents a dynamic neural network model based on the predictive coding framework for perceiving and predicting dynamic visuo-proprioceptive patterns. in a previous study [1], we have shown that a deep dynamic neural network model was able to coordinate visual perception and action generation in a seamless manner. in the current study, we extended the previous model under the predictive coding framework to endow the model with the capability of perceiving and predicting dynamic visuo-proprioceptive patterns as well as the capability of inferring the intention behind the perceived visuomotor information through the minimization of the prediction error. a set of synthetic experiments was conducted in which a robot learned to imitate the gestures of another robot in a simulation environment. the experimental results showed that, given intention states, the model was able to mentally simulate the possible incoming dynamic visuo-proprioceptive patterns in a top-down process without inputs from the external environment. moreover, the results highlighted the role of minimizing the prediction error in inferring the underlying intention of the perceived visuo-proprioceptive patterns, supporting the predictive coding account of the mirror neuron systems. the results also revealed that minimizing the prediction error in one modality induced the recall of the corresponding representation of another modality acquired during the consolidative learning of raw-level visuo-proprioceptive patterns.",4 "enhanced deep residual networks for single image super-resolution.
recent research on super-resolution has progressed with the development of deep convolutional neural networks (dcnn). in particular, residual learning techniques exhibit improved performance. in this paper, we develop an enhanced deep super-resolution network (edsr) with performance exceeding those of current state-of-the-art sr methods. the significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. the performance is further improved by expanding the model size while we stabilize the training procedure. we also propose a new multi-scale deep super-resolution system (mdsr) and training method, which can reconstruct high-resolution images for different upscaling factors in a single model. the proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the ntire2017 super-resolution challenge.",4 "content-based image retrieval based on late fusion of binary local descriptors. one of the challenges in content-based image retrieval (cbir) is to reduce the semantic gaps between low-level features and high-level semantic concepts. in cbir, the images are represented in a feature space, and the performance of cbir depends on the type of selected feature representation. late fusion, also known as visual words integration, is applied to enhance the performance of image retrieval. the recent advances in image retrieval have diverted the focus of research towards the use of binary descriptors, as they are reported to be computationally efficient. in this paper, we aim to investigate the late fusion of the fast retina keypoint (freak) and the scale invariant feature transform (sift). this late fusion of a binary and a local descriptor is selected because, among the binary descriptors, freak has shown good results in classification-based problems, while sift is robust to translation, scaling, rotation and small distortions. the late fusion of freak and sift integrates the performance of both feature descriptors for effective image retrieval. experimental results and comparisons show that the proposed late fusion enhances the performance of image retrieval.",4 "approximating continuous functions by relu nets of minimal width. this article concerns the expressive power of depth in deep feed-forward neural nets with relu activations.
specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with relu activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? it turns out that this minimal width is exactly equal to $d_{in}+1.$ that is, if all the hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, relu nets can only express a limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by relu nets in which all hidden layers have width exactly $d_{in}+1.$ our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb r^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. we obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$.",19 "a modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds. recently, much work has been done on extending the scope of online learning and incremental stochastic optimization algorithms. in this paper we contribute to this effort in two ways: first, based on a new regret decomposition and a generalization of bregman divergences, we provide a self-contained, modular analysis of the two workhorses of online learning: (general) adaptive versions of mirror descent (md) and the follow-the-regularized-leader (ftrl) algorithms. the analysis is done with extra care so as not to introduce assumptions not needed in the proofs, and allows us to combine, in a straightforward way, different algorithmic ideas (e.g., adaptivity, optimism, implicit updates) and learning settings (e.g., strongly convex or composite objectives). this way we are able to reprove, extend and refine a large body of the literature, while keeping the proofs concise. the second contribution is a byproduct of this careful analysis: we present algorithms with improved variational bounds for smooth, composite objectives, including a new family of optimistic md algorithms with only one projection step per round. furthermore, we provide a simple extension of adaptive regret bounds to practically relevant non-convex problem settings with essentially no extra effort.",4 "a neural network approach to context-sensitive generation of conversational responses.
we present a novel response generation system that can be trained end to end on large quantities of unstructured twitter conversations. a neural network architecture is used to address sparsity issues that arise when integrating contextual information into classic statistical models, allowing the system to take into account previous dialog utterances. our dynamic-context generative models show consistent gains over both context-sensitive and non-context-sensitive machine translation and information retrieval baselines.",4 "capacity and trainability in recurrent neural networks. two potential bottlenecks on the expressiveness of recurrent neural networks (rnns) are their ability to store information about the task in their parameters, and to store information about the input history in their units. we show experimentally that all common rnn architectures achieve nearly the same per-task and per-unit capacity bounds with careful training, for a variety of tasks and stacking depths. they can store an amount of task information that is linear in the number of parameters, and is approximately 5 bits per parameter. they can additionally store approximately one real number from their input history per hidden unit. we further find that for several tasks it is the per-task parameter capacity bound that determines performance. these results suggest that many previous results comparing rnn architectures are driven primarily by differences in training effectiveness, rather than by differences in capacity. supporting this observation, we compare training difficulty for several architectures, and show that vanilla rnns are far more difficult to train, yet have slightly higher capacity. finally, we propose two novel rnn architectures, one of which is easier to train than the lstm or gru for deeply stacked architectures.",19 "simple pairs of points in digital spaces. topology-preserving transformations of digital spaces by contracting simple pairs of points. transformations of digital spaces preserving local and global topology play an important role in thinning, skeletonization and simplification of digital images. in the present paper, we introduce and study contractions of simple pairs of points based on the notions of a digital contractible space and contractible transformations of digital spaces. we show that the contraction of a simple pair of points preserves the local and global topology of a digital space. relying on the obtained results, we study properties of digital manifolds.
in particular, we show that a digital n-manifold can be transformed to its compressed form with the minimal number of points by sequential contractions of simple pairs. key words: graph, digital space, contraction, splitting, simple pair, homotopy, thinning",4 "cross-media similarity evaluation for web image retrieval in the wild. in order to retrieve unlabeled images by textual queries, cross-media similarity computation is a key ingredient. although novel methods are continuously introduced, little has been done to evaluate these methods together with large-scale query log analysis. consequently, how far these methods have brought us in answering real-user queries is unclear. given baseline methods that compute cross-media similarity using relatively simple text/image matching, how much progress the advanced models have made is also unclear. this paper takes a pragmatic approach to answering these two questions. queries are automatically categorized according to the proposed query visualness measure, and later connected to the evaluation of multiple cross-media similarity models on three test sets. such a connection reveals that the success of the state-of-the-art is mainly attributed to their good performance on visual-oriented queries, while these queries account for only a small part of real-user queries. to quantify the current progress, we propose a simple text2image method, representing a novel test query by a set of images selected from a large-scale query log. consequently, computing the cross-media similarity between the test query and a given image boils down to comparing the visual similarity between the given image and the selected images. image retrieval experiments on the challenging clickture dataset show that the proposed text2image compares favorably to recent deep learning based alternatives.",4 "koniq-10k: towards an ecologically valid and large-scale iqa database. the main challenge in applying state-of-the-art deep learning methods to predict image quality in-the-wild is the relatively small size of existing quality scored datasets. the reason for the lack of larger datasets is the massive resources required in generating diverse and publishable content. we present a new systematic and scalable approach to create large-scale, authentic and diverse image datasets for image quality assessment (iqa).
We show how we built an IQA database, KonIQ-10k, consisting of 10,073 images, on which we performed large scale crowdsourcing experiments in order to obtain reliable quality ratings from 1,467 crowd workers (1.2 million ratings). We argue for its ecological validity by analyzing the diversity of the dataset, comparing it to state-of-the-art IQA databases, and checking the reliability of our user studies.",4 "momentum and stochastic momentum for stochastic gradient, newton, proximal point and subspace descent methods. In this paper we study several classes of stochastic optimization algorithms enriched with heavy ball momentum. Among the methods studied are: stochastic gradient descent, stochastic Newton, stochastic proximal point and stochastic dual subspace ascent. This is the first time momentum variants of several of these methods are studied. We choose to perform our analysis in a setting in which all of these methods are equivalent. We prove global nonasymptotic linear convergence rates for all methods and various measures of success, including primal function values, primal iterates (in the L2 sense), and dual function values. We also show that the primal iterates converge at an accelerated linear rate in the L1 sense. This is the first time a linear rate is shown for the stochastic heavy ball method (i.e., the stochastic gradient descent method with momentum). Under somewhat weaker conditions, we establish a sublinear convergence rate for Cesaro averages of the primal iterates. Moreover, we propose a novel concept, which we call stochastic momentum, aimed at decreasing the cost of performing the momentum step. We prove linear convergence of several stochastic methods with stochastic momentum, and show that in some sparse data regimes and for sufficiently small momentum parameters, these methods enjoy better overall complexity than methods with deterministic momentum. Finally, we perform extensive numerical testing on artificial and real datasets, including data coming from average consensus problems.",12 "generalized topic modeling. Recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. In standard topic models, a topic (such as sports, business, or politics) is viewed as a probability distribution $\vec a_i$ over words, and a document is generated by first selecting a mixture $\vec w$ over topics, and then generating words i.i.d. from the associated mixture $a_{\vec w}$.
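The heavy ball momentum update studied in the momentum entry above has a very compact form: the next iterate moves along the negative gradient plus a multiple of the previous displacement. The following is a minimal sketch under my own choice of test problem and step sizes (the function name and parameters are illustrative, not from the paper):

```python
import numpy as np

def heavy_ball(grad, x0, step=0.1, beta=0.9, iters=200):
    """Gradient descent with heavy ball momentum:
    x_{k+1} = x_k - step * grad(x_k) + beta * (x_k - x_{k-1})."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x_next = x - step * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# minimize f(x) = 0.5 * ||x||^2, whose gradient is simply x
sol = heavy_ball(lambda x: x, [5.0, -3.0])
```

The stochastic heavy ball method analyzed in the paper replaces `grad(x)` with a stochastic gradient; the update rule itself is unchanged.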
Given a large collection of such documents, the goal is to recover the topic vectors and then to correctly classify new documents according to their topic mixture. In this work we consider a broad generalization of this framework in which words are no longer assumed to be drawn i.i.d.; instead, a topic is a complex distribution over sequences of paragraphs. Since one could not hope to even represent such a distribution in general (even if paragraphs are given using a natural feature representation), we aim instead to directly learn a document classifier. That is, we aim to learn a predictor that, given a new document, accurately predicts its topic mixture, without learning the distributions explicitly. We present several natural conditions under which one can do this efficiently and discuss issues of noise tolerance and sample complexity in this model. More generally, our model can be viewed as a generalization of the multi-view or co-training setting in machine learning.",4 "an oriented straight line segment algebra: qualitative spatial reasoning about oriented objects. Nearly 15 years ago, a set of qualitative spatial relations between oriented straight line segments (dipoles) was suggested by Schlieder. This work received substantial interest amongst the qualitative spatial reasoning community. However, it turned out to be difficult to establish a sound constraint calculus based on these relations. In this paper, we present the results of a new investigation into dipole constraint calculi which uses algebraic methods to derive sound results on the composition of relations and other properties of dipole calculi. Our results are based on a condensed semantics of the dipole relations. In contrast to the points normally used, dipoles are extended and have an intrinsic direction. Both features are important properties of natural objects. This allows for a straightforward representation of prototypical reasoning tasks for spatial agents. As an example, we show how to generate survey knowledge from local observations in a street network. The example illustrates the fast constraint-based reasoning capabilities of the dipole calculus. We integrate our results into two reasoning tools which are publicly available.",4 "memory enriched big bang big crunch optimization algorithm for data clustering. Cluster analysis plays an important role in the decision making process of many knowledge-based systems.
There exist a wide variety of different approaches for clustering applications, including heuristic techniques, probabilistic models, and traditional hierarchical algorithms. In this paper, a novel heuristic approach based on the big bang-big crunch algorithm is proposed for clustering problems. The proposed method not only takes advantage of its heuristic nature to alleviate the shortcomings of typical clustering algorithms such as k-means, but also benefits from a memory based scheme compared to similar heuristic techniques. Furthermore, the performance of the proposed algorithm is investigated on several benchmark test functions as well as on well-known datasets. The experimental results show the significant superiority of the proposed method over similar algorithms.",4 "causal decision trees. Uncovering causal relationships in data is a major objective of data analytics. Causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however, are expensive or infeasible to conduct in many cases. Causal relationships can also be found using well designed observational studies, but they require domain experts' knowledge and the process is normally time consuming. Hence there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery, and a classification method may find false causal signals and miss true ones. In this paper, we develop a causal decision tree whose nodes have causal interpretations. Our method follows a well established causal inference framework and makes use of a classic statistical test. The method is practical for finding causal signals in large data sets.",4 "a one class classifier based framework using svdd: application to an imbalanced geological dataset. Evaluation of a hydrocarbon reservoir requires classification of its petrophysical properties from the available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of subsurface physical properties.
In this context, the present study proposes a generalized one class classification framework based on support vector data description (SVDD) to classify a reservoir characteristic, water saturation, into two classes (class high and class low) from four logs, namely gamma ray, neutron porosity, bulk density, and p sonic, using an imbalanced dataset. A comparison is carried out between the proposed framework and different supervised classification algorithms in terms of g metric means and execution time. Experimental results show that the proposed framework outperformed the other classifiers in terms of these performance evaluators. It is envisaged that the classification analysis performed in this study will be useful in reservoir modeling.",4 "recurrent neural network postfilters for statistical parametric speech synthesis. In the last two years, numerous papers have looked at using deep neural networks to replace the acoustic model in traditional statistical parametric speech synthesis. However, far less attention has been paid to approaches like DNN-based postfiltering, where DNNs work in conjunction with traditional acoustic models. In this paper, we investigate the use of recurrent neural networks as a potential postfilter for synthesis. We explore the possibility of replacing existing postfilters, as well as highlight the ease with which arbitrary new features can be added as input to the postfilter. We also tried a novel approach of jointly training the classification and regression tree and the postfilter, rather than the traditional approach of training them independently.",4 "direct learning to rank and rerank. Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with the state of learning-to-rank algorithms, which is that they are based on convex proxies that lead to poor approximations. We then discuss the possibility of ""exact"" reranking algorithms based on mathematical programming. We prove that a relaxed version of the ""exact"" problem has the same optimal solution, and provide an empirical analysis.",19 "linearized kernel dictionary learning. In this paper we present a new approach for incorporating kernels into dictionary learning.
The kernel k-svd algorithm (kksvd), which was introduced recently, shows an improvement in classification performance with relation to its linear counterpart k-svd. However, this algorithm requires the storage and handling of a very large kernel matrix, which leads to high computational cost, while also limiting its use to setups with a small number of training examples. We address these problems by combining two ideas: first we approximate the kernel matrix using a cleverly sampled subset of its columns using the Nystr\""{o}m method; secondly, as we wish to avoid using this matrix altogether, we decompose it by SVD to form new ""virtual samples,"" on which any linear dictionary learning can be employed. Our method, termed ""linearized kernel dictionary learning"" (lkdl), can be seamlessly applied as a pre-processing stage on top of any efficient off-the-shelf dictionary learning scheme, effectively ""kernelizing"" it. We demonstrate the effectiveness of our method on several tasks of both supervised and unsupervised classification, and show the efficiency of the proposed scheme, its easy integration and its performance boosting properties.",4 "precision and recall for range-based anomaly detection. Classical anomaly detection is principally concerned with point-based anomalies, anomalies that occur at a single data point. In this paper, we present a new mathematical model to express range-based anomalies, anomalies that occur over a range (or period) of time.",4 "application-oriented terminology evaluation: the case of back-of-the book indexes. This paper addresses the problem of computational terminology evaluation, not per se but in a specific application context. The paper describes the evaluation procedure used to assess the validity of our overall indexing approach and the quality of the inddoc indexing tool. Even if user-oriented extended evaluation is irreplaceable, we argue that early evaluations are possible and useful for development guidance.",4 "weather perception: joint data association, tracking, and classification for autonomous ground vehicles. A novel probabilistic perception algorithm is presented as a real-time joint solution to data association, object tracking, and object classification for an autonomous ground vehicle in all-weather conditions.
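The Nystroem column-sampling step used by the linearized kernel dictionary learning entry above is a standard construction: given a PSD kernel matrix, pick a subset of columns `C` and the corresponding square block `W`, and approximate `K ~= C @ pinv(W) @ C.T`. A minimal sketch (the function name and index-selection strategy are my own; the paper additionally applies an SVD step that is omitted here):

```python
import numpy as np

def nystroem_approx(K, idx):
    """Nystroem approximation of a PSD kernel matrix from a column subset:
    K ~= C @ pinv(W) @ C.T, with C = K[:, idx] and W = K[np.ix_(idx, idx)]."""
    C = K[:, idx]                 # sampled columns
    W = K[np.ix_(idx, idx)]       # intersection block
    return C @ np.linalg.pinv(W) @ C.T
```

When the kernel matrix has low rank and the sampled columns span its range, the approximation is exact; in general the quality depends on how the columns are chosen.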
The presented algorithm extends a Rao-Blackwellized particle filter, originally built as a particle filter for data association and a Kalman filter for multi-object tracking (Miller et al. 2011a), to also include multiple model tracking and classification. Additionally, a state-of-the-art vision detection algorithm that includes heading information for autonomous ground vehicle (AGV) applications was implemented. Cornell's AGV from the DARPA Urban Challenge was upgraded and used to experimentally examine whether state-of-the-art vision algorithms can complement or replace lidar and radar sensors. Sensor and algorithm performance in adverse weather and lighting conditions is tested. The experimental evaluation demonstrates robust all-weather data association, tracking, and classification, where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm.",4 "redundancy in logic i: cnf propositional formulae. A knowledge base is redundant if it contains parts that can be inferred from the rest of it. We study the problem of checking whether a CNF formula (a set of clauses) is redundant, that is, whether it contains clauses that can be derived from the other ones. Any CNF formula can be made irredundant by deleting some of its clauses: what results is an irredundant equivalent subset (i.e.s.). We study the complexity of several related problems: verification, checking the existence of an i.e.s. of a given size, checking the necessary and possible presence of clauses in i.e.s.'s, and uniqueness. We also consider the problem of redundancy with different definitions of equivalence.",4 "a novel tuneable method for skin detection based on a hybrid color space and color statistical features. Skin detection is one of the most important and primary stages in image processing applications such as face detection and human tracking. So far, many approaches have been proposed and much work has been done on this case. Most methods have tried to find the best match for the intensity distribution of skin pixels based on popular color spaces such as RGB, CMYK or YCbCr. The results show that these methods cannot provide an accurate approach for every kind of skin. In this paper, an approach is proposed to solve this problem using a statistical features technique. The approach comprises two stages. In the first, pure skin statistical features are extracted, and in the second stage, skin pixels are detected using the HSV and YCbCr color spaces.
In the results part, the proposed approach was applied to the FEI database and an accuracy rate of 99.25 ± 0.2% was reached. The proposed method was also applied to a complex background database and an accuracy rate of 95.40 ± 0.31% was obtained. The proposed approach can be used on all kinds of skin without a training stage, which is one of its main advantages. Low noise sensitivity and low computational complexity are its other advantages.",4 "a universal variance reduction-based catalyst for nonconvex low-rank matrix recovery. We propose a generic framework based on a new stochastic variance-reduced gradient descent algorithm for accelerating nonconvex low-rank matrix recovery. Starting from an appropriate initial estimator, our proposed algorithm performs projected gradient descent based on a novel semi-stochastic gradient specifically designed for low-rank matrix recovery. Based upon mild restricted strong convexity and smoothness conditions, we derive a projected notion of the restricted Lipschitz continuous gradient property, and prove that our algorithm enjoys a linear convergence rate to the unknown low-rank matrix with improved computational complexity. Moreover, our algorithm can be employed for both noiseless and noisy observations, where the optimal sample complexity and the minimax optimal statistical rate can be attained respectively. We illustrate the superiority of our generic framework through several specific examples, both theoretically and experimentally.",19 "informed heuristics for guiding stem-and-cycle ejection chains. The state of the art in local search for the traveling salesman problem is dominated by ejection chain methods utilising the stem-and-cycle reference structure. Though effective, these algorithms employ little information in their successor selection strategy, typically seeking only to minimise the cost of the move. We propose an alternative approach inspired by the AI literature and show how an admissible heuristic can be used to guide successor selection. We undertake an empirical analysis that demonstrates that this technique often produces better results than less informed strategies, albeit at the cost of running in higher polynomial time.",4 "finding approximate local minima faster than gradient descent.
We design a non-convex second-order optimization algorithm that is guaranteed to return an approximate local minimum in time which scales linearly in the underlying dimension and the number of training examples. The time complexity of our algorithm to find an approximate local minimum is even faster than that of gradient descent to find a critical point. Our algorithm applies to a general class of optimization problems including the training of a neural network and other non-convex objectives arising in machine learning.",12 "projection onto the probability simplex: an efficient algorithm with a simple proof, and an application. We provide an elementary proof of a simple, efficient algorithm for computing the Euclidean projection of a point onto the probability simplex. We also show an application in Laplacian k-modes clustering.",4 "real-time web scale event summarization using sequential decision making. We present a system based on sequential decision making for the online summarization of massive document streams, such as those found on the web. Given an event of interest (e.g. ""Boston marathon bombing""), our system is able to filter the stream for relevance and produce a series of short text updates describing the event as it unfolds over time. Unlike previous work, our approach is able to jointly model the relevance, comprehensiveness, novelty, and timeliness required by time-sensitive queries. We demonstrate a 28.3% improvement in summary F1 and a 43.8% improvement in time-sensitive F1 metrics.",4 "automated identification of trampoline skills using computer vision extracted pose estimation. A novel method to identify trampoline skills using a single video camera is proposed herein. Conventional computer vision techniques are used for the identification, estimation, and tracking of the gymnast's body in a video recording of the routine. For each frame, an open source convolutional neural network is used to estimate the pose of the athlete's body. Body orientation and joint angle estimates are extracted from the pose estimates. The trajectories of these angle estimates over time are compared with labelled reference skills. A nearest neighbour classifier utilising a mean squared error distance metric is used to identify the skill performed. A dataset containing 714 skill examples with 20 distinct skills performed by adult male and female gymnasts was recorded and used for evaluation of the system.
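The simplex projection entry above refers to the well-known sort-and-threshold algorithm for Euclidean projection onto {x : x >= 0, sum(x) = 1}. A minimal sketch of that standard algorithm (the function name is mine; this is the commonly cited procedure, not necessarily the exact pseudocode of the paper):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {x : x >= 0, sum(x) = 1}, via the standard sort-and-threshold rule."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    # largest index rho with u[rho] + (1 - css[rho]) / (rho + 1) > 0
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)       # shift so the result sums to 1
    return np.maximum(v + theta, 0.0)
```

The cost is dominated by the sort, so the projection runs in O(n log n) for an n-dimensional input.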
The system was found to achieve a skill identification accuracy of 80.7% on this dataset.",4 "clustering with side information: probabilistic model and deterministic algorithm. In this paper, we propose a model-based clustering method (tvclust) that robustly incorporates noisy side information as soft-constraints and aims to seek a consensus between the side information and the observed data. Our method is based on a nonparametric Bayesian hierarchical model that combines a probabilistic model for the data instances with one for the side-information. An efficient Gibbs sampling algorithm is proposed for posterior inference. Using the small-variance asymptotics of our probabilistic model, we then derive a new deterministic clustering algorithm (rdp-means). It can be viewed as an extension of k-means that allows for the inclusion of side information and has the additional property that the number of clusters need not be specified a priori. Empirical studies have been carried out to compare our work with many constrained clustering algorithms from the literature, on a variety of data sets and under a variety of conditions, such as using noisy side information and erroneous k values. The results of our experiments show strong performance for both the probabilistic and deterministic approaches under these conditions when compared to other algorithms in the literature.",19 "an algorithm for missing values imputation in categorical data with use of association rules. This paper presents an algorithm for missing values imputation in categorical data. The algorithm is based on using association rules and is presented in three variants. Experiments show better accuracy of missing values imputation using this algorithm than by using the most common attribute value.",4 "automatic recognition of mammal genera on camera-trap images using multi-layer robust principal component analysis and mixture neural networks. The segmentation and classification of animals from camera-trap images is, due to the conditions under which the images are taken, a difficult task. This work presents a method for classifying and segmenting mammal genera from camera-trap images. Our method uses multi-layer robust principal component analysis (rpca) for segmenting, convolutional neural networks (cnns) for extracting features, least absolute shrinkage and selection operator (lasso) for selecting features, and artificial neural networks (anns) or support vector machines (svm) for classifying the mammal genera present in the Colombian forest.
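The imputation entry above compares its association-rule algorithm against the most-common-attribute-value baseline. That baseline is easy to make concrete; the following is a sketch of the baseline only (not the paper's association-rule algorithm), with a function name and missing-value convention of my own choosing:

```python
from collections import Counter

def impute_mode(rows, missing=None):
    """Baseline imputation for categorical data: replace each missing
    entry with the most common value of that attribute (column)."""
    cols = list(zip(*rows))
    modes = [Counter(v for v in col if v is not missing).most_common(1)[0][0]
             for col in cols]
    return [[modes[j] if v is missing else v for j, v in enumerate(row)]
            for row in rows]
```

An association-rule variant would instead condition the filled-in value on the other attributes of the same row, which is what allows it to beat this per-column mode.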
We evaluated the method on camera-trap images from the Alexander von Humboldt Biological Resources Research Institute. We obtained an accuracy of 92.65% classifying 8 mammal genera and a false positive (fp) class, using automatically segmented images. On the other hand, we reached 90.32% accuracy classifying 10 mammal genera, using ground-truth images only. Unlike almost all previous works, we confront both animal segmentation and genera classification in camera-trap recognition. This method shows a new approach toward fully-automatic detection of animals in camera-trap images.",4 "visual-hint boundary to segment algorithm for image segmentation. Image segmentation has been an active research topic in the image analysis area. Currently, most image segmentation algorithms are designed based on the idea that images are partitioned into a set of regions preserving homogeneous intra-regions and inhomogeneous inter-regions. However, human visual intuition does not always follow this pattern. A new image segmentation method named visual-hint boundary to segment (vhbs) is introduced, which is more consistent with human perceptions. vhbs abides by two visual hint rules based on human perceptions: (i) global scale boundaries tend to be the real boundaries of the objects; (ii) two adjacent regions with quite different colors or textures tend to result in real boundaries between them. It has been demonstrated by experiments that, compared with traditional image segmentation methods, vhbs has better performance and also preserves higher computational efficiency.",4 "hyperparameter optimization and boosting for classifying facial expressions: how good can a ""null"" model be?. One of the goals of the ICML workshop on representation learning is to establish benchmark scores for a new data set of labeled facial expressions. This paper presents the performance of a ""null"" model consisting of convolutions with random weights, PCA, pooling, normalization, and a linear readout. Our approach focused on hyperparameter optimization rather than novel model components. On the facial expression recognition challenge held by the Kaggle website, our hyperparameter optimization approach achieved a score of 60% accuracy on the test data. This paper also introduces a new ensemble construction variant that combines hyperparameter optimization with the construction of ensembles.
This algorithm constructed an ensemble of four models that scored 65.5% accuracy. These scores rank 12th and 5th respectively among the 56 challenge participants. It is worth noting that our approach was developed prior to the release of the data set and applied without modification; the strong competition performance suggests that the TPE hyperparameter optimization algorithm and the domain expertise encoded in the null model generalize to new image classification data sets.",4 "vector space model as cognitive space for text classification. In this era of digitization, knowing the user's sociolect aspects has become an essential feature in building user specific recommendation systems. These sociolect aspects can be found by mining the user's language, shared in the form of text on social media and in reviews. This paper describes the experiment performed for the PAN author profiling 2017 shared task. The objective of the task is to find the sociolect aspects of users from their tweets. The sociolect aspects considered in this experiment are the user's gender and native language information. The user's tweets, written in a language different from their native language, are represented as a document-term matrix with a document frequency constraint. Classification is done using a support vector machine, taking gender and native language as target classes. The experiment attains an average accuracy of 73.42% in gender prediction and 76.26% in the native language identification task.",4 "adversarial extreme multi-label classification. The goal of extreme multi-label classification is to learn a classifier which can assign a small subset of relevant labels to an instance from an extremely large set of target labels. Datasets in extreme classification exhibit a long tail of labels which have small numbers of positive training instances. In this work, we pose the learning task in extreme classification with a large number of tail-labels as learning in the presence of adversarial perturbations. This view motivates a robust optimization framework and an equivalence to a corresponding regularized objective. Under the proposed robustness framework, we demonstrate the efficacy of hamming loss for tail-label detection in extreme classification.
The equivalent regularized objective, in combination with proximal gradient based optimization, performs better than state-of-the-art methods on propensity scored versions of precision@k and ndcg@k (up to 20% relative improvement over pfastrexml - a leading tree-based approach, and 60% relative improvement over sleec - a leading label-embedding approach). Furthermore, we also highlight the sub-optimality of a sparse solver in a widely used package for large-scale linear classification, which is interesting in its own right. We also investigate the spectral properties of label graphs, providing novel insights towards understanding the conditions governing the performance of hamming loss based one-vs-rest schemes vis-\`a-vis label embedding methods.",19 "a paradigm shift: detecting human rights violations through web images. The growing presence of devices carrying digital cameras, such as mobile phones and tablets, combined with ever improving internet networks, has enabled ordinary citizens, victims of human rights abuse, and participants in armed conflicts, protests, and disaster situations to capture and share, via social media networks, images and videos of specific events. This paper discusses the potential of images in the human rights context, including the opportunities and challenges they present. The study demonstrates that real-world images have the capacity to contribute complementary data to operational human rights monitoring efforts when combined with novel computer vision approaches. The analysis is concluded by arguing that if images are to be used effectively to detect and identify human rights violations by rights advocates, greater attention to gathering task-specific visual concepts from large-scale web images is required.",4 "dendritic error backpropagation in deep cortical microcircuits. Animal behaviour depends on learning to associate sensory stimuli with the desired motor command. Understanding how the brain orchestrates the necessary synaptic modifications across different brain areas has remained a longstanding puzzle. Here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output.
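The extreme multi-label entry above reports results in precision@k and ndcg@k, which are standard ranking metrics for binary relevance. A minimal sketch of both (function names and the binary-relevance formulation are conventional, not specific to the paper):

```python
import numpy as np

def precision_at_k(y_true, scores, k):
    """Fraction of the top-k scored labels that are relevant."""
    top = np.argsort(scores)[::-1][:k]
    return float(np.asarray(y_true)[top].sum()) / k

def ndcg_at_k(y_true, scores, k):
    """Normalized discounted cumulative gain for binary relevance."""
    y = np.asarray(y_true, dtype=float)
    discounts = np.log2(np.arange(2, k + 2))   # positions 1..k
    top = np.argsort(scores)[::-1][:k]
    dcg = (y[top] / discounts).sum()
    idcg = (np.sort(y)[::-1][:k] / discounts).sum()
    return float(dcg / idcg) if idcg > 0 else 0.0
```

The propensity-scored variants used in the paper additionally reweight each label by its inverse propensity to de-bias the tail; that weighting is omitted here.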
In this model, synaptic learning is driven by a local dendritic prediction error that arises from a failure to predict the top-down input given the bottom-up activities. Such errors occur at the apical dendrites of pyramidal neurons, where long-range excitatory feedback and local inhibitory predictions are integrated. When local inhibition fails to match the excitatory feedback, an error occurs which triggers plasticity at the bottom-up synapses on the basal dendrites of the same pyramidal neurons. We demonstrate the learning capabilities of the model on a number of tasks and show that it approximates the classical error backpropagation algorithm. Finally, complementing this cortical circuit with a disinhibitory mechanism enables attention-like stimulus denoising and generation. Our framework makes several experimental predictions on the function of dendritic integration and cortical microcircuits, is consistent with recent observations of cross-area learning, and suggests a biological implementation of deep learning.",16 "content selection in data-to-text systems: a survey. Data-to-text systems are powerful in generating reports from data automatically and thus simplify the presentation of complex data. Rather than presenting data using visualisation techniques, data-to-text systems use natural (human) language, which is the most common way of human-human communication. In addition, data-to-text systems can adapt their output content to users' preferences, background or interests, and are therefore pleasant for users to interact with. Content selection is an important part of every data-to-text system, because it is the module that determines which of the available information should be conveyed to the user. This survey initially introduces the field of data-to-text generation, describes the general data-to-text system architecture, and then reviews the state-of-the-art content selection methods. Finally, it provides recommendations for choosing an approach and discusses opportunities for future research.",4 "a logic for reasoning about evidence. We introduce a logic for reasoning about evidence that essentially views evidence as a function from prior beliefs (before making an observation) to posterior beliefs (after making the observation). We provide a sound and complete axiomatization for the logic, and consider the complexity of the decision problem.
Although the reasoning in the logic is mainly propositional, we allow variables representing numbers and quantification over them. This expressive power seems necessary to capture important properties of evidence.",4 "a methodology to analyze the accuracy of 3d objects reconstructed with collaborative robot based monocular lsd-slam. SLAM systems are mainly applied for robot navigation, while research on their feasibility for motion planning with SLAM for tasks like bin-picking is scarce. Accurate 3d reconstruction of objects and environments is important for planning motion and computing an optimal gripper pose to grasp objects. In this work, we propose methods to analyze the accuracy of a 3d environment reconstructed using an lsd-slam system with a monocular camera mounted onto the gripper of a collaborative robot. We discuss and propose a solution to the pose space conversion problem. Finally, we present several criteria to analyze the 3d reconstruction accuracy. These could be used as guidelines to improve the accuracy of 3d reconstructions with monocular lsd-slam and other slam based solutions.",4 "the absent-minded driver problem redux. This paper reconsiders the problem of the absent-minded driver who must choose between alternatives with different payoffs, with imperfect recall and varying degrees of knowledge of the system. The classical absent-minded driver problem represents the case with limited information and has bearing on the general area of communication and learning, social choice, mechanism design, auctions, and theories of knowledge, belief, and rational agency. Within the framework of extensive games, this problem has applications to many artificial intelligence scenarios. Obviously, the performance of the agent improves as the information available to it increases. It is shown that a non-uniform assignment strategy for successive choices does better than a fixed probability strategy. We consider both classical and quantum approaches to the problem. We argue that the superior performance of quantum decisions with access to entanglement cannot be fairly compared to a classical algorithm. If the cognitive systems of agents are taken to have access to quantum resources, or to have a quantum mechanical basis, that can be leveraged into superior performance.",4 "hypothesis testing using pairwise distances and associated kernels (with appendix).
We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between embeddings of distributions into reproducing kernel hilbert spaces (rkhs), as established in machine learning. The equivalence holds when energy distances are computed with semimetrics of negative type, in which case a kernel may be defined such that the rkhs distance between distributions corresponds exactly to the energy distance. We determine the class of probability distributions for which kernels induced by semimetrics are characteristic (that is, for which embeddings of distributions into an rkhs are injective). Finally, we investigate the performance of this family of kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.",4 "deep cnn based feature extractor for text-prompted speaker recognition. Deep learning is still not a very common tool in the speaker verification field. We study deep convolutional neural network performance in the text-prompted speaker verification task. The prompted passphrase is segmented into word states - i.e. digits - to test each digit utterance separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. The key feature of our network is the max-feature-map activation function, which acts as an embedded feature selector. By using a multitask learning scheme to train the high-level feature extractor, we were able to surpass the classic baseline systems in terms of quality and achieved impressive results for such a novice approach, getting 2.85% eer on the rsr2015 evaluation set. Fusion of the proposed and baseline systems improves this result.",6 "interactive graphics for visually diagnosing forest classifiers in r. This paper describes structuring data and constructing plots to explore forest classification models interactively. A forest classifier is an example of an ensemble, produced by bagging multiple trees. The process of bagging and combining results from multiple trees produces numerous diagnostics which, with interactive graphics, can provide a lot of insight into class structure in high dimensions.
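The kernel-embedding distance discussed in the hypothesis-testing entry above is usually estimated as squared MMD: mean within-sample kernel values minus twice the cross-sample mean. A minimal (biased, quadratic-time) sketch, with a function name and Gaussian kernel of my own choosing:

```python
import numpy as np

def mmd2(x, y, kernel):
    """Biased estimate of the squared MMD between samples x and y
    under kernel k(a, b): E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    kxx = np.mean([kernel(a, b) for a in x for b in x])
    kyy = np.mean([kernel(a, b) for a in y for b in y])
    kxy = np.mean([kernel(a, b) for a in x for b in y])
    return kxx + kyy - 2.0 * kxy

# a simple Gaussian kernel; with a distance-induced kernel of negative
# type, the same estimator recovers the energy distance
gauss = lambda a, b: np.exp(-np.sum((a - b) ** 2))
```

The estimate is zero for identical samples and strictly positive for well-separated ones, which is what the two-sample tests in the paper exploit.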
Various aspects are explored in this paper, to assess model complexity, individual model contributions, variable importance and dimension reduction, and the uncertainty of prediction associated with individual observations. The ideas are applied to the random forest algorithm and the projection pursuit forest, but could be more broadly applied to other bagged ensembles. The interactive graphics are built in r, using the ggplot2, plotly, and shiny packages.",19 "deep adversarial neural decoding. Here, we present a novel approach to solving the problem of reconstructing perceived stimuli from brain responses by combining probabilistic inference with deep learning. Our approach first inverts the linear transformation from latent features to brain responses with maximum a posteriori estimation, and then inverts the nonlinear transformation from perceived stimuli to latent features with adversarial training of convolutional neural networks. We test our approach with a functional magnetic resonance imaging experiment and show that it can generate state-of-the-art reconstructions of perceived faces from brain activations.",16 "rows vs columns for linear systems of equations - randomized kaczmarz or coordinate descent?. This paper is about randomized iterative algorithms for solving a linear system of equations $x \beta = y$ in different settings. Recent interest in the topic was reignited when Strohmer and Vershynin (2009) proved the linear convergence rate of a randomized kaczmarz (rk) algorithm that works on the rows of $x$ (data points). Following that, Leventhal and Lewis (2010) proved the linear convergence of a randomized coordinate descent (rcd) algorithm that works on the columns of $x$ (features). The aim of this paper is to simplify our understanding of these two algorithms, establish the direct relationships between them (though rk is often compared to stochastic gradient descent), and examine the algorithmic commonalities and tradeoffs involved in working with rows or columns. We also discuss kernel ridge regression and present a kaczmarz-style algorithm that works on data points and has the advantage of solving the problem without ever storing or forming the gram matrix, one of the recognized problems encountered when scaling kernelized methods.",12 "examining cooperation in visual dialog models.
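The randomized Kaczmarz iteration discussed in the rows-vs-columns entry above is short enough to state in full: pick a row with probability proportional to its squared norm, then project the iterate onto that row's hyperplane. A minimal sketch (function name, iteration budget and seed are my own choices; the row-sampling rule follows Strohmer and Vershynin):

```python
import numpy as np

def randomized_kaczmarz(X, y, iters=500, seed=0):
    """Randomized Kaczmarz for a consistent system X @ beta = y: pick row i
    with probability ~ ||x_i||^2, then project the iterate onto the
    hyperplane x_i @ beta = y_i."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, float), np.asarray(y, float)
    probs = (X ** 2).sum(axis=1)
    probs = probs / probs.sum()
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        i = rng.choice(len(y), p=probs)
        beta += (y[i] - X[i] @ beta) / (X[i] @ X[i]) * X[i]
    return beta

# a small consistent system with known solution [1, -2]
X = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta = randomized_kaczmarz(X, X @ np.array([1.0, -2.0]))
```

Each step touches a single row (data point), which is what makes the Gram-matrix-free kernelized variant in the paper possible.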
In this work we propose a blackbox intervention method for visual dialog models, with the aim of assessing the contribution of individual linguistic or visual components. Concretely, we conduct structured or randomized interventions that aim to impair an individual component of the model, and observe the changes in task performance. We reproduce a state-of-the-art visual dialog model and demonstrate that our methodology yields surprising insights, namely that both dialog and image information have minimal contributions to task performance. The intervention method presented here can be applied as a sanity check for the strength and robustness of each component in visual dialog systems.",4 "learning optimal forecast aggregation in partial evidence environments. We consider the forecast aggregation problem in repeated settings, where the forecasts are made about a binary event. At each period multiple experts provide forecasts about the event. The goal of the aggregator is to aggregate those forecasts into a subjective accurate forecast. We assume that the experts are Bayesian; namely, they share a common prior, each expert is exposed to some evidence, and each expert applies Bayes rule to deduce his forecast. The aggregator is ignorant with respect to the information structure (i.e., the distribution over evidence) according to which the experts make their predictions. The aggregator observes the experts' forecasts only. At the end of each period the actual state is realized. We focus on the question of whether the aggregator can learn to aggregate optimally the forecasts of the experts, where the optimal aggregation is the Bayesian aggregation that takes into account all the information (evidence) in the system. We consider the class of partial evidence information structures, where each expert is exposed to a different subset of conditionally independent signals. Our main results are positive; we show that optimal aggregation can be learned in polynomial time in a quite wide range of instances of partial evidence environments. We provide a tight characterization of the instances where learning is possible and impossible.",4 "solving cooperative reliability games. Cooperative games model the allocation of profit from joint actions, following considerations such as stability and fairness. We propose the reliability extension of such games, where agents may fail to participate in the game. In the reliability extension, each agent only ""survives"" with a certain probability, and a coalition's value is the probability that its surviving members would form a winning coalition in the base game.
We study prominent solution concepts in such games, showing how to approximate the Shapley value and how to compute the core in games with few agent types. We also show that applying the reliability extension may stabilize the game, making the core non-empty even when the base game has an empty core.",4 "Machine learning for drug overdose surveillance. We describe two recently proposed machine learning approaches for discovering emerging trends in fatal accidental drug overdoses. The Gaussian Process Subset Scan enables early detection of emerging patterns in spatio-temporal data, accounting for both the non-iid nature of the data and the fact that detecting subtle patterns requires integration of information across multiple spatial areas and multiple time steps. We apply this approach to 17 years of county-aggregated data for monthly opioid overdose deaths in the New York City metropolitan area, showing clear advantages in the utility of discovered patterns as compared to typical anomaly detection approaches. To detect and characterize emerging overdose patterns that differentially affect a subpopulation of the data, including geographic, demographic, and behavioral patterns (e.g., which combinations of drugs are involved), we apply the Multidimensional Tensor Scan to 8 years of case-level overdose data from Allegheny County, PA. We discover previously unidentified overdose patterns which reveal unusual demographic clusters, show the impacts of drug legislation, and demonstrate potential for early detection and targeted intervention. These approaches to early detection of overdose patterns can inform prevention and response efforts, as well as the understanding of the effects of policy changes.",4 "Modeling vagueness and uncertainty in data-to-text systems through fuzzy sets. Vagueness and uncertainty management is counted among one of the challenges that remain unresolved in systems that generate texts from non-linguistic data, known as data-to-text systems. In the last decade, work on fuzzy linguistic summarization and description of data has raised interest in using fuzzy sets to model and manage the imprecision of human language in data-to-text systems. However, despite the research done in this direction, there is still no actual clear discussion and justification of how fuzzy sets can contribute to data-to-text for modeling vagueness and uncertainty in words and expressions. 
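The reliability-games abstract above mentions approximating the Shapley value; a common way to do that is Monte Carlo averaging of marginal contributions over random agent orders. A toy sketch under assumed details (a majority base game and independent survival probabilities; the function names and game are illustrative, not from the paper):

```python
import random

def reliability_value(coalition, survive_p, is_winning, samples=2000, rng=None):
    """Expected value of a coalition when each member survives independently."""
    rng = rng or random.Random(0)
    wins = 0
    for _ in range(samples):
        survivors = frozenset(a for a in coalition if rng.random() < survive_p[a])
        wins += is_winning(survivors)
    return wins / samples

def shapley_monte_carlo(agents, value, permutations=400, rng=None):
    """Estimate Shapley values by averaging marginal contributions over random orders."""
    rng = rng or random.Random(1)
    phi = {a: 0.0 for a in agents}
    order = list(agents)
    for _ in range(permutations):
        rng.shuffle(order)
        coalition = []
        prev = value(frozenset())
        for a in order:
            coalition.append(a)
            cur = value(frozenset(coalition))
            phi[a] += cur - prev                 # marginal contribution of a in this order
            prev = cur
    return {a: v / permutations for a, v in phi.items()}
```

For three symmetric agents in a 2-out-of-3 majority game with survival probability 1, the estimates concentrate near the exact Shapley value of 1/3 each.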
This paper intends to bridge this gap by answering the following questions: What does vagueness mean in fuzzy sets theory? What does vagueness mean in data-to-text contexts? In what ways can fuzzy sets theory contribute to improving data-to-text systems? What challenges do researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? In what cases should the use of fuzzy sets be avoided in D2T? For this, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual usages of fuzzy sets in data-to-text contexts, and provide additional insights into the engineering of data-to-text systems that make use of fuzzy set-based techniques.",4 "User modelling for avoiding overfitting in interactive knowledge elicitation for prediction. In human-in-the-loop machine learning, the user provides information beyond that in the training data. Many algorithms and user interfaces have been designed to optimize and facilitate this human--machine interaction; however, fewer studies have addressed the potential defects such designs can cause. Effective interaction often requires exposing the user to the training data or its statistics. The design of the system is then critical, as this can lead to a double use of data and overfitting, if the user reinforces noisy patterns in the data. We propose a user modelling methodology, assuming simple rational behaviour, to correct this problem. We show, in a user study with 48 participants, that the method improves predictive performance in a sparse linear regression sentiment analysis task, where graded user knowledge on feature relevance is elicited. We believe that the key idea of inferring user knowledge with probabilistic user models has general applicability for guarding against overfitting and improving interactive machine learning.",4 "Classification constrained dimensionality reduction. Dimensionality reduction is a topic of recent interest. In this paper, we present the classification constrained dimensionality reduction (CCDR) algorithm to account for label information. The algorithm accounts for multiple classes as well as the semi-supervised setting. We present out-of-sample expressions for both labeled and unlabeled data. For unlabeled data, we introduce a method of embedding a new point as preprocessing to a classifier. 
For labeled data, we introduce a method that improves the embedding during the training phase using the out-of-sample extension. We investigate the classification performance using the CCDR algorithm on hyper-spectral satellite imagery data. We demonstrate the performance gain for both local and global classifiers and demonstrate a 10% improvement of the $k$-nearest neighbors algorithm performance. We also present a connection between intrinsic dimension estimation and the optimal embedding dimension obtained using the CCDR algorithm.",19 "Learning ELM network weights using linear discriminant analysis. We present an alternative to the pseudo-inverse method for determining the hidden to output weight values for extreme learning machines performing classification tasks. The method is based on linear discriminant analysis and provides Bayes optimal single point estimates for the weight values.",4 "Characterisation of (sub)sequential rational functions over a general class of monoids. In this technical report we describe a general class of monoids for which (sub)sequential rational functions can be characterised in terms of a congruence relation in the flavour of the Myhill-Nerode relation. The class of monoids we consider is described in terms of natural algebraic axioms, contains the free monoids, groups, and the tropical monoid, and is closed under Cartesian products.",4 "Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification. This paper proposes a novel deep learning framework named the bidirectional-convolutional long short term memory (Bi-CLSTM) network to automatically learn the spectral-spatial features of hyperspectral images (HSIs). In the network, the issue of spectral feature extraction is considered as a sequence learning problem, and a recurrent connection operator across the spectral domain is used to address it. Meanwhile, inspired by the widely used convolutional neural network (CNN), a convolution operator across the spatial domain is incorporated into the network to extract the spatial features. Besides, to sufficiently capture the spectral information, a bidirectional recurrent connection is proposed. In the classification phase, the learned features are concatenated into a vector and fed to a softmax classifier via a fully-connected operator. 
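The ELM abstract above proposes replacing the pseudo-inverse solve for the hidden-to-output weights. For context, a minimal sketch of the baseline it modifies: a standard extreme learning machine with a random hidden layer and a least-squares (pseudo-inverse) output solve. This is the textbook ELM recipe, not the paper's LDA variant:

```python
import numpy as np

def elm_train(X, y_onehot, hidden=50, seed=0):
    """Standard ELM: fixed random hidden layer, closed-form least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                               # random nonlinear features
    beta, *_ = np.linalg.lstsq(H, y_onehot, rcond=None)  # pseudo-inverse solve for output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Class = argmax of the linear readout over the random hidden features."""
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```

The LDA-based method in the abstract would swap the `lstsq` step for discriminant-analysis estimates of the same output weights.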
To validate the effectiveness of the proposed Bi-CLSTM framework, we compare it with several state-of-the-art methods, including the CNN framework, on three widely used HSIs. The obtained results show that Bi-CLSTM can improve the classification performance as compared to the other methods.",4 "A flexible iterative framework for consensus clustering. A novel framework for consensus clustering is presented which has the ability to determine the number of clusters in the final solution while using multiple algorithms. A consensus similarity matrix is formed from an ensemble using multiple algorithms and several values for k. A variety of dimension reduction techniques and clustering algorithms are considered in the analysis. For noisy or high-dimensional data, an iterative technique is presented to refine this consensus matrix in a way that encourages the algorithms to agree upon a common solution. We utilize the theory of nearly uncoupled Markov chains to determine the number, k, of clusters in a dataset by considering a random walk on the graph defined by the consensus matrix. The eigenvalues of the associated transition probability matrix are used to determine the number of clusters. This method succeeds in determining the number of clusters in many datasets where previous methods fail. On every considered dataset, the consensus method provides a final result with accuracy well above the average of the individual algorithms.",19 "WaterGAN: unsupervised generative network to enable real-time color correction of monocular underwater images. This paper reports on WaterGAN, a generative adversarial network (GAN) for generating realistic underwater images from in-air image and depth pairings, in an unsupervised pipeline used for color correction of monocular underwater images. Cameras onboard autonomous and remotely operated vehicles can capture high resolution images to map the seafloor; however, underwater image formation is subject to the complex process of light propagation through the water column. The raw images retrieved are characteristically different from images taken in air due to the effects of absorption and scattering, which cause attenuation of light at different rates for different wavelengths. While this physical process is well described theoretically, the model depends on many parameters intrinsic to the water column as well as the objects in the scene. 
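The consensus-clustering abstract above determines k from the eigenvalues of the transition matrix of a random walk on the consensus graph. A minimal sketch of that idea (the spectral-gap rule below is one standard reading of the nearly-uncoupled-Markov-chain argument, not the paper's exact procedure):

```python
import numpy as np

def consensus_matrix(labelings):
    """Fraction of ensemble members that place each pair of points in the same cluster."""
    n = len(labelings[0])
    C = np.zeros((n, n))
    for labels in labelings:
        labels = np.asarray(labels)
        C += (labels[:, None] == labels[None, :]).astype(float)
    return C / len(labelings)

def estimate_num_clusters(C):
    """Count the eigenvalues of the random-walk transition matrix clustered near 1."""
    P = C / C.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    eig = np.sort(np.real(np.linalg.eigvals(P)))[::-1]
    gaps = eig[:-1] - eig[1:]              # the largest spectral gap separates the near-1 group
    return int(np.argmax(gaps)) + 1
```

Because the consensus matrix only records co-membership, the estimate is invariant to how each ensemble member labels its clusters.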
These factors make the recovery of these parameters difficult without simplifying assumptions or field calibration; hence, restoration of underwater images is a non-trivial problem. Deep learning has demonstrated great success in modeling complex nonlinear systems, but requires a large amount of training data, which is difficult to compile in deep sea environments. Using WaterGAN, we generate a large training dataset of paired imagery, both raw underwater and true color in-air, as well as depth data. This data serves as input to a novel end-to-end network for color correction of monocular underwater images. Due to the depth-dependent water column effects inherent to underwater environments, we show that our end-to-end network implicitly learns a coarse depth estimate of the underwater scene from monocular underwater images. Our proposed pipeline is validated with testing on real data collected from both a pure water tank and from underwater surveys in field testing. Source code is made publicly available with sample datasets and pretrained models.",4 "MAG: a multilingual, knowledge-base agnostic and deterministic entity linking approach. Entity linking has recently been the subject of a significant body of research. Currently, the best performing approaches rely on trained mono-lingual models. Porting these approaches to other languages is consequently a difficult endeavor, as it requires corresponding training data and retraining of the models. We address this drawback by presenting a novel multilingual, knowledge-base agnostic and deterministic approach to entity linking, dubbed MAG. MAG is based on a combination of context-based retrieval on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data sets and 7 languages. Our results show that the best approach trained on English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse on datasets in other languages. MAG, on the other hand, achieves state-of-the-art performance on English datasets and reaches a micro F-measure that is up to 0.6 higher than PBOH's on non-English languages.",4 "Face synthesis (FASY) system for generation of a face image from human description. This paper aims at generating a new face based on a human-like description using a new concept. The FASY (FAce SYnthesis) System is a face database retrieval and new face generation system in development. 
One of its main features is the generation of the requested face when it is not found in the existing database, which also allows the database to keep growing continuously.",4 "Watersheds on edge or node weighted graphs ""par l'exemple"". Watersheds may be defined both on node and on edge weighted graphs. We show that they are identical: for each edge (resp. node) weighted graph there exists a node (resp. edge) weighted graph with the same minima and catchment basins.",4 "Humans and deep networks largely agree on which kinds of variation make object recognition harder. View-invariant object recognition is a challenging problem which has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g. 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best algorithms for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs at view-invariant object recognition using the same images and controlling the kinds of transformation as well as their magnitude. We used four object categories and images rendered from 3D computer models. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth was by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position. This suggests that humans recognize objects mainly through 2D template matching, rather than by constructing 3D object models, and that DCNNs are not unreasonable models of human feed-forward vision. Also, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.",4 "Joint estimation of multiple graphical models from high dimensional time series. 
In this manuscript we consider the problem of jointly estimating multiple graphical models in high dimensions. We assume the data are collected from n subjects, each of which consists of T possibly dependent observations. The graphical models of the subjects vary, but are assumed to change smoothly corresponding to a measure of closeness between subjects. We propose a kernel based method for jointly estimating all graphical models. Theoretically, under a double asymptotic framework, where both (T,n) and the dimension can increase, we provide an explicit rate of convergence in parameter estimation. It characterizes the strength one can borrow across different individuals and the impact of data dependence on parameter estimation. Empirically, experiments on both synthetic and real resting state functional magnetic resonance imaging (rs-fMRI) data illustrate the effectiveness of the proposed method.",19 "Robust optical flow estimation in rainy scenes. Optical flow estimation in rainy scenes is challenging due to the background degradation introduced by rain streaks and rain accumulation effects in the scene. The rain accumulation effect refers to the poor visibility of remote objects due to intense rainfall. Existing optical flow methods are erroneous when applied to rain sequences, because the conventional brightness constancy constraint (BCC) and gradient constancy constraint (GCC) generally break down in this situation. Based on the observation that the RGB color channels receive raindrop radiance equally, we introduce a residue channel as a new data constraint to reduce the effect of rain streaks. To handle rain accumulation, our method decomposes the image into a piecewise-smooth background layer and a high-frequency detail layer. It also enforces the BCC on the background layer only. Results on both a synthetic dataset and real images show that our algorithm outperforms existing methods on different types of rain sequences. To our knowledge, this is the first optical flow method specifically dealing with rain.",4 "A quantum mechanical approach for modelling the reliability of sensor reports. Dempster-Shafer evidence theory is widely applied in multi-sensor data fusion. However, much uncertainty and interference exist in practical situations, especially on the battle field. It is still an open issue to model the reliability of sensor reports. Many methods have been proposed based on the relationship among the collected data. 
In this letter, we propose a quantum mechanical approach to evaluate the reliability of sensor reports, which is based on the properties of the sensor itself. The proposed method can be used to modify the combining of evidences.",4 "Weakly supervised PLDA training. PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. However, PLDA training requires a large amount of labelled development data, which is highly expensive in most cases. We present a cheap PLDA training approach, which assumes that speakers in the same session can be easily separated, and that speakers in different sessions are simply different. This results in `weak labels' which are not fully accurate but are cheap, leading to weak PLDA training. Experimental results on real-life large-scale telephony customer service data demonstrated that the weak training can offer good performance when human-labelled data are limited. Interestingly, the weak training can be employed as a discriminative adaptation approach, which is more efficient than the prevailing unsupervised method when human-labelled data are insufficient.",4 "Non-simplifying graph rewriting termination. So far, a large amount of work in natural language processing (NLP) relies on trees as the core mathematical structure to represent linguistic information (e.g. in Chomsky's work). However, some linguistic phenomena do not cope properly with trees. In a former paper, we showed the benefit of encoding linguistic structures by graphs and of using graph rewriting rules to compute on those structures. Justified by some linguistic considerations, graph rewriting is characterized by two features: first, node creation along computations and second, non-local edge modifications. Under these hypotheses, we show that uniform termination is undecidable and that non-uniform termination is decidable. We describe two termination techniques based on weights and we give complexity bounds on the derivation length for these rewriting systems.",4 "Compressed sensing using generative models. The goal of compressed sensing is to estimate a vector from an underdetermined system of noisy linear measurements, by making use of prior knowledge of the structure of vectors in the relevant domain. For almost all results in this literature, the structure is represented by sparsity in a well-chosen basis. 
We show how to achieve guarantees similar to standard compressed sensing but without employing sparsity at all. Instead, we suppose that vectors lie near the range of a generative model $G: \mathbb{R}^k \to \mathbb{R}^n$. Our main theorem is that, if $G$ is $L$-Lipschitz, then roughly $O(k \log L)$ random Gaussian measurements suffice for an $\ell_2/\ell_2$ recovery guarantee. We demonstrate our results using generative models from published variational autoencoder and generative adversarial networks. Our method can use $5$-$10$x fewer measurements than Lasso for the same accuracy.",19 "Properties of sparse distributed representations and their application to hierarchical temporal memory. Empirical evidence demonstrates that every region of the neocortex represents information using sparse activity patterns. This paper examines sparse distributed representations (SDRs), the primary information representation strategy in hierarchical temporal memory (HTM) systems and the neocortex. We derive a number of properties that are core to scaling, robustness, and generalization. We use the theory to provide practical guidelines and illustrate the power of SDRs as the basis of HTM. Our goal is to help create a unified mathematical and practical framework for SDRs as they relate to cortical function.",16 "Aorta segmentation for stent simulation. Simulation of arterial stenting procedures prior to intervention allows for appropriate device selection as well as highlighting potential complications. To this end, we present a framework for facilitating virtual aortic stenting from a contrast computer tomography (CT) scan. More specifically, we present a method for both lumen and outer wall segmentation that may be employed in determining the appropriateness of intervention as well as in the selection and localization of the device. The challenging recovery of the outer wall is based on a novel minimal closure tracking algorithm. Our aortic segmentation method has been validated on over 3000 multiplanar reformatting (MPR) planes from 50 CT angiography data sets, yielding a Dice similarity coefficient (DSC) of 90.67%.",4 "A framework for genetic algorithms based on Hadoop. Genetic Algorithms (GAs) are powerful metaheuristic techniques used in many real-world applications. The sequential execution of GAs requires considerable computational power in both time and resources. 
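The compressed-sensing abstract above recovers a signal by searching over the latent space of a generative model. A toy sketch of the recovery step: minimize $\|A G(z) - y\|_2^2$ over $z$ by gradient descent. To keep the sketch self-contained, a random linear map stands in for the trained generator (the paper uses VAE/GAN decoders with autodiff), so the gradient is available in closed form:

```python
import numpy as np

def recover_latent(A, G, y, steps=500, seed=0):
    """Gradient descent on ||A G z - y||^2 with a linear stand-in generator G (n x k matrix)."""
    rng = np.random.default_rng(seed)
    M = A @ G                                            # composed measurement-of-generator map
    z = rng.standard_normal(G.shape[1]) * 0.1
    lr = 1.0 / (2 * np.linalg.norm(M, ord=2) ** 2)       # safe step size for this quadratic
    for _ in range(steps):
        z -= lr * 2 * M.T @ (M @ z - y)                  # gradient of the squared residual
    return z
```

With far fewer measurements than ambient dimensions (here m = 20, n = 100), recovery still succeeds because the unknown lies in the k-dimensional range of G.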
Nevertheless, GAs are naturally parallel, and accessing a parallel platform such as the Cloud is easy and cheap. Apache Hadoop is one of the common services that can be used for parallel applications. However, using Hadoop to develop a parallel version of GAs is not simple without facing its inner workings. Even though some sequential frameworks for GAs already exist, there is no framework supporting the development of GA applications that can be executed in parallel. In this paper we describe a framework for parallel GAs on the Hadoop platform, following the paradigm of MapReduce. The main purpose of the framework is to allow the user to focus on the aspects of the GA that are specific to the problem to be addressed, while being sure that this task is correctly executed on the Cloud with good performance. The framework has also been exploited to develop an application for the feature subset selection problem. A preliminary analysis of the performance of the developed GA application was performed using three datasets, and showed promising performance.",4 "Speaker recognition for children's speech. This paper presents results on speaker recognition (SR) for children's speech, using the OGI Kids corpus and GMM-UBM and GMM-SVM SR systems. Regions of the spectrum containing important speaker information for children are identified by conducting SR experiments over 21 frequency bands. As for adults, the spectrum can be split into four regions, with the first (containing primary vocal tract resonance information) and third (corresponding to high frequency speech sounds) being the most useful for SR. However, the frequencies at which these regions occur are from 11% to 38% higher for children. It is also noted that subband SR rates are lower for younger children. Finally, results are presented of SR experiments to identify a child in a class (30 children of similar age) and in a school (288 children of varying ages). Class performance depends on age, with accuracy varying from 90% for young children to 99% for older children. The identification rate achieved for a child in a school is 81%.",4 "A deep learning-based food calorie estimation method in dietary assessment. Obesity treatment requires obese patients to record all food intakes per day. Computer vision has been introduced to estimate calories from food images. In order to increase the accuracy of detection and reduce the error of volume estimation in food calorie estimation, we present a calorie estimation method in this paper. To estimate the calories of food, a top view and a side view are needed. 
Faster R-CNN is used to detect the food and a calibration object. The GrabCut algorithm is used to get each food's contour. Then the volume is estimated from the food and the corresponding object. Finally, we estimate each food's calories. The experiment results show our estimation method is effective.",4 "Representing human and machine dictionaries in markup languages. In this chapter we present the main issues in representing machine readable dictionaries in XML, in particular according to the Text Encoding Initiative (TEI) guidelines.",4 "Deep multi-view learning with stochastic decorrelation loss. Multi-view learning aims to learn an embedding space in which multiple views are either maximally correlated for cross-view recognition, or decorrelated for latent factor disentanglement. A key challenge for deep multi-view representation learning is scalability. To correlate or decorrelate multi-view signals, the covariance of the whole training set should be computed, which does not fit well with the mini-batch based training strategy; moreover, (de)correlation should be done in a way that is free of SVD-based computation in order to scale to contemporary layer sizes. In this work, a unified approach is proposed for efficient and scalable deep multi-view learning. Specifically, a mini-batch based stochastic decorrelation loss (SDL) is proposed, which can be applied to any network layer to provide soft decorrelation of the layer's activations. This reveals the connection between deep multi-view learning models such as deep canonical correlation analysis (DCCA) and the factorisation autoencoder (FAE), and allows them to be easily implemented. We show that SDL is superior to other decorrelation losses in terms of efficacy and scalability.",4 "Application of multiview techniques to the NHANES dataset. Disease prediction or classification using health datasets involves using well-known predictors associated with the disease as features for the models. This study considers multiple data components of an individual's health, using the relationships between variables to generate features that may improve the performance of disease classification models. In order to capture information from different aspects of the data, this project uses a multiview learning approach, namely canonical correlation analysis (CCA), a technique that finds projections with maximum correlation between two data views. 
Data categories collected from the NHANES survey (1999-2014) are used as views to learn the multiview representations. The usefulness of the representations is demonstrated by applying the features to a diabetes classification task.",4 "Tumor classification and segmentation of MR brain images. The diagnosis and segmentation of tumors using any medical diagnostic tool can be challenging due to the varying nature of this pathology. Magnetic resonance imaging (MRI) is an established diagnostic tool for various diseases and disorders and plays a major role in clinical neuro-diagnosis. Supplementing this technique with automated classification and segmentation tools is gaining importance, to reduce errors and the time needed to make a conclusive diagnosis. In this paper a simple three-step algorithm is proposed: (1) identification of patients that present with tumors, (2) automatic selection of abnormal slices of the patients, and (3) segmentation and detection of the tumor. Features were extracted using a discrete wavelet transform on the normalized images and classified by a support vector machine (for step (1)) and a random forest (for step (2)). The 400 subjects were divided in a 3:1 ratio between training and test with no overlap. This study is novel in terms of the use of data, as it employed the entire T2 weighted slices as a single image for classification, and a unique combination of a contralateral approach with patch thresholding for segmentation, which does not require a training set or a template as is used in other segmentation studies. Using the proposed method, the tumors were segmented accurately with a classification accuracy of 95%, with 100% specificity and 90% sensitivity.",4 "Multispectral image denoising with an optimized vector non-local mean filter. Nowadays, many applications rely on images of high quality to ensure good performance in conducting their tasks. However, noise goes against this objective, as it is an unavoidable issue in most applications. Therefore, it is essential to develop techniques to attenuate the impact of noise, while maintaining the integrity of relevant information in images. We propose in this work to extend the application of the non-local means filter (NLM) to the vector case and apply it for denoising multispectral images. The objective is to benefit from the additional information brought by multispectral imaging systems. The NLM filter exploits the redundancy of information in an image to remove noise. The restored pixel is a weighted average of the pixels of the image. 
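The NHANES abstract above relies on CCA to find maximally correlated projections of two views. A compact sketch of CCA via SVD of the whitened cross-covariance (a standard derivation; the toy data and regularization constant are assumptions, not from the study):

```python
import numpy as np

def cca(X, Y, n_components=1, reg=1e-6):
    """Canonical correlation analysis via SVD of the whitened cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])   # within-view covariances (regularized)
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n                              # cross-view covariance
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))    # whitening transforms
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy.T)      # singular values = canonical correlations
    A = Wx.T @ U[:, :n_components]                 # projection directions for view X
    B = Wy.T @ Vt[:n_components].T                 # projection directions for view Y
    return A, B, s[:n_components]
```

When both views are noisy linear readouts of a shared latent variable, the leading canonical correlation approaches 1.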
In our contribution, we propose an optimization framework to dynamically fine tune the NLM filter parameters and attenuate its computational complexity by considering only the pixels most similar to each other in computing the restored pixel. The filter parameters are optimized using Stein's unbiased risk estimator (SURE) rather than by ad hoc means. Experiments have been conducted on multispectral images corrupted with additive white Gaussian noise. PSNR and similarity comparisons with other approaches are provided to illustrate the efficiency of our approach in terms of both denoising performance and computation complexity.",4 "Variation of word frequencies in Russian literary texts. We study the variation of word frequencies in Russian literary texts. Our findings indicate that the standard deviation of a word's frequency across texts depends on its average frequency according to a power law with exponent $0.62$, showing that rarer words have a relatively larger degree of frequency volatility (i.e., ""burstiness""). Several latent factors models have been estimated to investigate the structure of the word frequency distribution. The dependence of a word's frequency volatility on its average frequency can be explained by the asymmetry in the distribution of latent factors.",4 "A semantic classifier approach to document classification. In this paper we propose a new document classification method, bridging discrepancies (the so-called semantic gap) between the training set and the application sets of textual data. We demonstrate its superiority over classical text classification approaches, including traditional classifier ensembles. The method consists in combining a document categorization technique with a single classifier or a classifier ensemble (the SEMCOM algorithm - Committee with Semantic Categorizer).",4 "Building fast and compact convolutional neural networks for offline handwritten Chinese character recognition. Like other problems in computer vision, offline handwritten Chinese character recognition (HCCR) has achieved impressive results using convolutional neural network (CNN)-based methods. However, larger and deeper networks are needed to deliver state-of-the-art results in this domain. Such networks intuitively appear to incur high computational cost, and require the storage of a large number of parameters, which renders them unfeasible for deployment in portable devices. 
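The NLM abstracts above describe restoring each pixel as a patch-similarity-weighted average over the image. A minimal scalar sketch on a 1D signal (the patch size and bandwidth `h` are illustrative choices; the paper's contribution of SURE-based parameter tuning and vector extension is not reproduced here):

```python
import numpy as np

def nlm_denoise_1d(signal, patch=3, h=0.5):
    """Non-local means: each sample becomes a similarity-weighted average of all samples."""
    pad = patch // 2
    padded = np.pad(signal, pad, mode='reflect')
    patches = np.stack([padded[i:i + patch] for i in range(len(signal))])
    out = np.empty(len(signal))
    for i in range(len(signal)):
        d2 = ((patches - patches[i]) ** 2).mean(axis=1)   # patch-wise squared distances
        w = np.exp(-d2 / h ** 2)                          # similarity weights
        out[i] = w @ signal / w.sum()                     # weighted average of all samples
    return out
```

On a piecewise-constant signal the weights concentrate on samples from the same level, so noise is averaged out with little blurring of the step edge.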
To solve this problem, we propose a global supervised low-rank expansion (GSLRE) method and an adaptive drop-weight (ADW) technique to address the problems of speed and storage capacity. We design a nine-layer CNN for HCCR consisting of 3,755 classes, and devise an algorithm that can reduce the network's computational cost by nine times and compress the network to 1/18 of the original size of the baseline model, with only a 0.21% drop in accuracy. In tests, the proposed algorithm surpassed the best single-network performance reported thus far in the literature while requiring only 2.3 MB for storage. Furthermore, when integrated with our effective forward implementation, the recognition of an offline character image took only 9.7 ms on a CPU. Compared with the state-of-the-art CNN model for HCCR, our approach is approximately 30 times faster, yet 10 times more cost efficient.",4 "Tree memory networks for modelling long-term temporal dependencies. In the domain of sequence modelling, recurrent neural networks (RNN) have been capable of achieving impressive results in a variety of application areas including visual question answering, part-of-speech tagging and machine translation. However this success in modelling short term dependencies has not successfully transitioned to application areas such as trajectory prediction, which require capturing both short term and long term relationships. In this paper, we propose a Tree Memory Network (TMN) for modelling long term and short term relationships in sequence-to-sequence mapping problems. The proposed network architecture is composed of an input module, a controller and a memory module. In contrast to related literature, which models the memory as a sequence of historical states, we model the memory as a recursive tree structure. This structure effectively captures temporal dependencies across both short term and long term sequences using its hierarchical structure. We demonstrate the effectiveness and flexibility of the proposed TMN in two practical problems, aircraft trajectory modelling and pedestrian trajectory modelling in a surveillance setting, and in both cases we outperform the current state-of-the-art. 
Furthermore, we perform an in depth analysis on the evolution of the memory module content over time and provide visual evidence on how the proposed TMN is able to map both long term and short term relationships efficiently via its hierarchical structure.",4 "Semantic-aware Grad-GAN for virtual-to-real urban scene adaption. Recent advances in vision tasks (e.g., segmentation) highly depend on the availability of large-scale real-world image annotations obtained by cumbersome human labor. Moreover, the perception performance often drops significantly for new scenarios, due to the poor generalization capability of models trained on limited and biased annotations. In this work, we resort to transferring knowledge from automatically rendered scene annotations in the virtual world to facilitate real-world visual tasks. Although virtual-world annotations can be ideally diverse and unlimited, the discrepant data distributions between the virtual and real world make it challenging for knowledge transferring. We thus propose a novel Semantic-aware Grad-GAN (SG-GAN) to perform virtual-to-real domain adaption with the ability of retaining vital semantic information. Beyond the simple holistic color/texture transformation achieved by prior works, SG-GAN successfully personalizes the appearance adaption for each semantic region in order to preserve its key characteristics for better recognition. It presents two main contributions to traditional GANs: 1) a soft gradient-sensitive objective for keeping semantic boundaries; 2) a semantic-aware discriminator for validating the fidelity of personalized adaptions with respect to each semantic region. Qualitative and quantitative experiments demonstrate the superiority of our SG-GAN in scene adaption over state-of-the-art GANs. Further evaluations on semantic segmentation on Cityscapes show that using adapted virtual images by SG-GAN dramatically improves segmentation performance compared to the original virtual data. We release our code at https://github.com/peilun-li/sg-gan.",4 "Approximated robust principal component analysis for improved general scene background subtraction. The research reported in this paper addresses the fundamental task of separation of locally moving or deforming image areas from a static or globally moving background. 
It builds on the latest developments in the field of robust principal component analysis, specifically, the recently reported practical solutions to the long-standing problem of recovering the low-rank and sparse parts of a large matrix made up of the sum of these two components. This article addresses a few critical issues including: embedding global motion parameters in the matrix decomposition model, i.e., estimation of global motion parameters simultaneously with the foreground/background separation task; considering matrix block-sparsity rather than generic matrix sparsity as a natural feature in video processing applications; attenuating background ghosting effects when the foreground is subtracted; and, critically, providing an extremely efficient algorithm to solve the low-rank/sparse matrix decomposition task. The first aspect is important for background/foreground separation in generic video sequences, where the background usually obeys global displacements originating from the camera motion in the capturing process. The second aspect exploits the fact that in video processing applications the sparse matrix has a particular structure, where the non-zero matrix entries are not randomly distributed but build small blocks within the sparse matrix. The next feature of the proposed approach addresses removal of ghosting effects originating from foreground silhouettes and the lack of information in the occluded background regions of the image. Finally, the proposed model also tackles algorithmic complexity by introducing an extremely efficient ""SVD-free"" technique that can be applied to most background/foreground separation tasks in conventional video processing.",4 "Supervised hashing using graph cuts and boosted decision trees. Embedding image features into a binary Hamming space can improve both the speed and accuracy of large-scale query-by-example image retrieval systems. Supervised hashing aims to map the original features to compact binary codes in a manner that preserves the label-based similarities of the original data. Most existing approaches apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of those methods, and can result in complex optimization problems that are difficult to solve.",4 
In this work we proffer a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. The proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods. Our framework decomposes the hashing learning problem into two steps: binary code (hash bits) learning, and hash function learning. The first step can typically be formulated as a binary quadratic problem, and the second step can be accomplished by training standard binary classifiers. For solving large-scale binary code inference, we show how to ensure that the binary quadratic problems are submodular, such that an efficient graph cut approach can be used. To achieve efficiency as well as efficacy on large-scale high-dimensional data, we propose to use boosted decision trees as the hash functions, which are nonlinear, highly descriptive, and very fast to train and evaluate. Experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods, especially on high-dimensional data.",4 "Unsupervised activity discovery and characterization from event-streams. We present a framework to discover and characterize different classes of everyday activities from event-streams. We begin by representing activities as bags of event n-grams. This allows us to analyze the global structural information of activities, using their local event statistics. We demonstrate how maximal cliques in an undirected edge-weighted graph of activities can be used for activity-class discovery in an unsupervised manner. We show how modeling an activity as a variable length Markov process can be used to discover recurrent event-motifs that characterize the discovered activity-classes. We present results over extensive data-sets, collected from multiple active environments, to show the competence and generalizability of our proposed framework.",4 "Hyperspectral image superresolution: an edge-preserving convex formulation. Hyperspectral remote sensing images (HSIs) are characterized by having a low spatial resolution and a high spectral resolution, whereas multispectral images (MSIs) are characterized by low spectral and high spatial resolutions. These complementary characteristics have stimulated active research in the inference of images with high spatial and spectral resolutions from HSI-MSI pairs. 
in this paper, we formulate this data fusion problem as the minimization of a convex objective function containing two data-fitting terms and an edge-preserving regularizer. the data-fitting terms are quadratic and account for blur, the different spatial resolutions, and additive noise; the regularizer, a form of vector total variation, promotes aligned discontinuities across the reconstructed hyperspectral bands. the optimization described above is rather hard, owing to its non-diagonalizable linear operators, to the non-quadratic and non-smooth nature of the regularizer, and to the large size of the image to be inferred. we tackle these difficulties by tailoring the split augmented lagrangian shrinkage algorithm (salsa)---an instance of the alternating direction method of multipliers (admm)---to this optimization problem. by using a convenient variable splitting and by exploiting the fact that hsis generally ""live"" in a low-dimensional subspace, we obtain an effective algorithm that yields state-of-the-art results, as illustrated by experiments.",4 "optimal transport maps for distribution preserving operations on latent spaces of generative models. generative models such as variational auto encoders (vaes) and generative adversarial networks (gans) are typically trained with a fixed prior distribution in the latent space, such as uniform or gaussian. after a trained model is obtained, one can sample the generator in various forms for exploration and understanding, such as interpolating between two samples, sampling in the vicinity of a sample, or exploring differences between a pair of samples applied to a third sample. in this paper, we show that the latent space operations used in the literature so far induce a distribution mismatch between the resulting outputs and the prior distribution the model was trained on. to address this, we propose to use distribution matching transport maps to ensure that such latent space operations preserve the prior distribution, while minimally modifying the original operation. our experimental results validate that the proposed operations give higher quality samples compared to the original operations.",4 "a closed-form marginal likelihood for gamma-poisson factorization. we present novel understandings of the gamma-poisson (gap) model, a probabilistic matrix factorization model for count data. we show that gap can be rewritten free of the score/activation matrix.
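the marginalization behind this rewriting can be seen entrywise: with a gamma prior (shape $\alpha$, rate $\beta$) on an activation $h$ and a poisson observation, $h$ integrates out in closed form to a negative binomial. this is a standard identity; the full gap marginal couples entries, but the basic mechanism is:

```latex
h \sim \mathrm{Gamma}(\alpha,\beta),\qquad
v \mid h \sim \mathrm{Poisson}(\lambda h)
\;\Longrightarrow\;
p(v) = \int_0^\infty p(v \mid h)\, p(h)\, \mathrm{d}h
     = \frac{\Gamma(v+\alpha)}{v!\,\Gamma(\alpha)}
       \left(\frac{\lambda}{\lambda+\beta}\right)^{\!v}
       \left(\frac{\beta}{\lambda+\beta}\right)^{\!\alpha}.
```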
this gives us new insights about the estimation of the topic/dictionary matrix by maximum marginal likelihood estimation. in particular, this explains the robustness of this estimator to over-specified values of the factorization rank, and in particular its ability to automatically prune spurious dictionary columns, as empirically observed in previous work. the marginalization of the activation matrix leads in turn to a new monte-carlo expectation-maximization algorithm with favorable properties.",19 "labelbank: revisiting global perspectives for semantic segmentation. semantic segmentation requires a detailed labeling of image pixels by object category. information derived from local image patches is necessary to describe the detailed shape of individual objects. however, this information is ambiguous and can result in noisy labels. global inference of image content can instead capture the general semantic concepts present. we advocate that holistic inference of image concepts provides valuable information for detailed pixel labeling. we propose a generic framework to leverage holistic information in the form of a labelbank for pixel-level segmentation. we show the ability of our framework to improve semantic segmentation performance in a variety of settings. we learn models for extracting a holistic labelbank from visual cues, attributes, and/or textual descriptions. we demonstrate improvements in semantic segmentation accuracy on standard datasets across a range of state-of-the-art segmentation architectures and holistic inference approaches.",4 "a multi-parametric solution-path algorithm for instance-weighted support vector machines. an instance-weighted variant of the support vector machine (svm) has attracted considerable attention recently since it is useful in various machine learning tasks such as non-stationary data analysis, heteroscedastic data modeling, transfer learning, learning to rank, and transduction. an important challenge in these scenarios is to overcome the computational bottleneck---instance weights often change dynamically or adaptively, and thus the weighted svm solutions must be repeatedly computed. in this paper, we develop an algorithm that can efficiently and exactly update the weighted svm solutions for arbitrary changes of instance weights.
technically, this contribution can be regarded as an extension of the conventional solution-path algorithm for a single regularization parameter to multiple instance-weight parameters. however, this extension gives rise to a significant problem: breakpoints (at which the solution path turns) have to be identified in a high-dimensional space. to facilitate this, we introduce a parametric representation of instance weights. we also provide a geometric interpretation in the weight space using the notion of a critical region: a polyhedron in which the current affine solution remains optimal. we find breakpoints at the intersections of the solution path and the boundaries of these polyhedrons. through extensive experiments on various practical applications, we demonstrate the usefulness of the proposed algorithm.",4 "algorithm runtime prediction: methods & evaluation. perhaps surprisingly, it is possible to predict how long an algorithm will take to run on a previously unseen input, using machine learning techniques to build a model of the algorithm's runtime as a function of problem-specific instance features. such models have important applications in algorithm analysis, portfolio-based algorithm selection, and the automatic configuration of parameterized algorithms. over the past decade, a wide variety of techniques have been studied for building such models. here, we describe extensions and improvements of existing models, new families of models, and -- perhaps most importantly -- a much more thorough treatment of algorithm parameters as model inputs. we also comprehensively describe new and existing features for predicting algorithm runtime for propositional satisfiability (sat), travelling salesperson (tsp) and mixed integer programming (mip) problems. we evaluate these innovations through the largest empirical analysis of its kind, comparing to a wide range of runtime modelling techniques from the literature. our experiments consider 11 algorithms and 35 instance distributions; they also span a wide range of sat, mip, and tsp instances, from the least structured, generated uniformly at random, to the most structured, emerging from real industrial applications.
overall, we demonstrate that our new models yield substantially better runtime predictions than previous approaches in terms of their generalization to new problem instances, to new algorithms from a parameterized space, and to both simultaneously.",4 "robust principal component analysis on graphs. principal component analysis (pca) is a widely used tool for linear dimensionality reduction and clustering. still, it is highly sensitive to outliers and does not scale well with respect to the number of data samples. robust pca solves the first issue with a sparse penalty term. the second issue can be handled with the matrix factorization model, which is however non-convex. besides, pca-based clustering can also be enhanced by using a graph of data similarity. in this article, we introduce a new model called ""robust pca on graphs"" which incorporates spectral graph regularization into the robust pca framework. the proposed model benefits from 1) the robustness of principal components to occlusions and missing values, 2) enhanced low-rank recovery, 3) improved clustering property due to the graph smoothness assumption on the low-rank matrix, and 4) the convexity of the resulting optimization problem. extensive experiments on 8 benchmark, 3 video and 2 artificial datasets with corruptions clearly reveal that our model outperforms 10 other state-of-the-art models in its clustering and low-rank recovery tasks.",4 mesa: maximum entropy simulated annealing. probabilistic reasoning systems combine different probabilistic rules and probabilistic facts to arrive at the desired probability values of consequences. in this paper we describe the mesa-algorithm (maximum entropy simulated annealing), which derives a joint distribution of variables or propositions. it takes into account the reliability of probability values and can resolve conflicts between contradictory statements. the joint distribution is represented in terms of marginal distributions and therefore allows one to process large inference networks and to determine desired probability values with high precision. the procedure derives a maximum entropy distribution subject to the given constraints. it can be applied to inference networks of arbitrary topology and may be extended in a number of directions.,4 "that's a fact: distinguishing factual and emotional argumentation in online dialogue.
we investigate the characteristics of factual and emotional argumentation styles observed in online debates. using an annotated set of ""factual"" and ""feeling"" debate forum posts, we extract patterns that are highly correlated with factual and emotional arguments, and then apply a bootstrapping methodology to find new patterns in a larger pool of unannotated forum posts. this process automatically produces a large set of patterns representing linguistic expressions that are highly correlated with factual and emotional language. finally, we analyze the most discriminating patterns to better understand the defining characteristics of factual and emotional arguments.",4 "chinese text in the wild. we introduce chinese text in the wild, a large dataset of chinese text in street view images. while optical character recognition (ocr) in document images is well studied and many commercial tools are available, the detection and recognition of text in natural images is still a challenging problem, especially for more complicated character sets such as chinese text. lack of training data has always been a problem, especially for deep learning methods which require massive training data. in this paper we provide details of a newly created dataset of chinese text, with about 1 million chinese characters annotated by experts in over 30 thousand street view images. this is a challenging dataset with good diversity. it contains planar text, raised text, text in cities, text in rural areas, text under poor illumination, distant text, partially occluded text, etc. for each character in the dataset, the annotation includes its underlying character, its bounding box, and 6 attributes. the attributes indicate whether it has a complex background, whether it is raised, whether it is handwritten or printed, etc. the large size and diversity of this dataset make it suitable for training robust neural networks for various tasks, particularly detection and recognition. we give baseline results using several state-of-the-art networks, including alexnet, overfeat, google inception and resnet for character recognition, and yolov2 for character detection in images. overall google inception has the best performance on recognition with 80.5% top-1 accuracy, while yolov2 achieves a map of 71.0% on detection. the dataset, source code and trained models are publicly available on the website.",4 "variable computation in recurrent neural networks.
recurrent neural networks (rnns) have been used extensively and with increasing success to model various types of sequential data. much of this progress has been achieved through devising recurrent units and architectures with the flexibility to capture complex statistics of the data, such as long range dependency or localized attention phenomena. however, while many sequential data (such as video, speech and language) have highly variable information flow, most recurrent models still consume input features at a constant rate and perform a constant number of computations per time step, which can be detrimental to both speed and model capacity. in this paper, we explore a modification to existing recurrent units which allows them to learn to vary the amount of computation they perform at each step, without prior knowledge of the sequence's time structure. we show experimentally that not only do our models require fewer operations, they also lead to better performance overall on our evaluation tasks.",19 "learning to avoid errors in gans by manipulating input spaces. despite recent advances, large scale visual artifacts are still a common occurrence in images generated by gans. previous work has focused on improving the generator's capability to accurately imitate the data distribution $p_{data}$. in this paper, we instead explore methods that enable gans to actively avoid errors by manipulating the input space. the core idea is to apply small changes to each noise vector in order to shift it away from areas in the input space that tend to result in errors. we derive three different architectures from this idea. the main one consists of a simple residual module that leads to significantly fewer visual artifacts, while only slightly decreasing diversity. the module is trivial to add to existing gans and costs almost zero computation and memory.",19 "adaboost and forward stagewise regression are first-order convex optimization methods. boosting methods are highly popular and effective supervised learning methods which combine weak learners into a single accurate model with good statistical performance. in this paper, we analyze two well-known boosting methods, adaboost and incremental forward stagewise regression (fs$_\varepsilon$), by establishing their precise connections to the mirror descent algorithm, which is a first-order method in convex optimization. as a consequence of these connections we obtain novel computational guarantees for these boosting methods.
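as concrete context for the boosting analysis above, a compact implementation of classical adaboost with exhaustively fitted decision stumps; this is the textbook weight-update form, and all names are illustrative:

```python
import numpy as np

def stump_predict(X, feat, thresh, sign):
    # decision stump: predict +-1 by thresholding a single feature
    return sign * np.where(X[:, feat] > thresh, 1.0, -1.0)

def fit_stump(X, y, w):
    # exhaustively pick the stump with lowest weighted error
    best = None
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for sign in (1.0, -1.0):
                err = np.sum(w * (stump_predict(X, feat, thresh, sign) != y))
                if best is None or err < best[0]:
                    best = (err, feat, thresh, sign)
    return best

def adaboost(X, y, rounds=20):
    # classical adaboost: reweight examples, accumulate weighted stumps
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        err, feat, thresh, sign = fit_stump(X, y, w)
        err = max(err, 1e-12)                     # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, feat, thresh, sign)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)
    return np.sign(score)
```

the mirror-descent view in the abstract interprets exactly this weight update as a first-order step on the example-weight simplex.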
in particular, we characterize convergence bounds for adaboost, related to both the margin and the log-exponential loss function, for any step-size sequence. furthermore, this paper presents, for the first time, precise computational complexity results for fs$_\varepsilon$.",19 "the discriminative kalman filter for nonlinear and non-gaussian sequential bayesian filtering. the kalman filter (kf) is used in a variety of applications for computing the posterior distribution of latent states in a state space model. the model requires a linear relationship between states and observations. extensions to the kalman filter have been proposed that incorporate linear approximations to nonlinear models, such as the extended kalman filter (ekf) and the unscented kalman filter (ukf). however, we argue that in cases where the dimensionality of observed variables greatly exceeds the dimensionality of state variables, a model for $p(\text{state}|\text{observation})$ proves easier to learn and more accurate for latent space estimation. we derive and validate what we call the discriminative kalman filter (dkf): a closed-form discriminative version of bayesian filtering that readily incorporates off-the-shelf discriminative learning techniques. further, we demonstrate that given mild assumptions, highly non-linear models for $p(\text{state}|\text{observation})$ can be specified. we motivate and validate the approach on synthetic datasets and neural decoding from non-human primates, showing substantial increases in decoding performance versus the standard kalman filter.",19 "learning and fusing multimodal features from and for multi-task facial computing. we propose a deep learning-based feature fusion approach for facial computing, including face recognition as well as gender, race and age detection. instead of training a single classifier on face images to classify them based on the features of the person whose face appears in the image, we first train four different classifiers for classifying face images based on race, age, gender and identification (id). multi-task features are then extracted from the trained models and cross-task-feature training is conducted, which shows the value of fusing multimodal features extracted from multiple tasks. we have found that features trained for one task can be used for other related tasks. interestingly, features trained for a task with more classes (e.g. id) and then used in another task with fewer classes (e.g. race) outperform the features trained for the task itself. the final feature fusion is performed by combining the four types of features extracted from the images by the four classifiers. this feature fusion approach improves classification accuracy by a 7.2%, 20.1%, 22.2% and 21.8% margin, respectively, for id, age, race and gender recognition, over the results of single classifiers trained on their individual features. the proposed method can be applied to other applications in which different types of data or features can be extracted.",4 "learned deep representations for action recognition?. as the success of deep models has led to their deployment in all areas of computer vision, it is increasingly important to understand what the representations are capturing. in this paper, we shed light on deep spatiotemporal representations by visualizing what two-stream models have learned in order to recognize actions in video. we show that local detectors for appearance and motion objects arise to form distributed representations for recognizing human actions. key observations include the following. first, cross-stream fusion enables the learning of true spatiotemporal features rather than simply separate appearance and motion features. second, the networks can learn local representations that are highly class specific, but also generic representations that can serve a range of classes. third, throughout the hierarchy of the network, features become more abstract and show increasing invariance to aspects of the data that are unimportant to the desired distinctions (e.g. motion patterns across various speeds). fourth, visualizations can be used not only to shed light on learned representations, but also to reveal idiosyncracies of training data and to explain failure cases of the system.",4 "video event recognition for surveillance applications (versa). versa provides a general-purpose framework for defining and recognizing events in live or recorded surveillance video streams. the approach to event recognition in versa uses a declarative logic language to define the spatial and temporal relationships that characterize a given event or activity. doing so requires the definition of certain fundamental spatial and temporal relationships and a high-level syntax for specifying frame templates and query parameters.
although the handling of uncertainty in the current versa implementation is simplistic, the language and architecture are amenable to extension using fuzzy logic or similar approaches. versa's high-level architecture is designed to work in xml-based, services-oriented environments. versa can be thought of as subscribing to xml annotations streamed from a lower-level video analytics service that provides basic entity detection, labeling, and tracking. one of many versa event monitors could thus analyze video streams and provide alerts when certain events are detected.",4 "spectral experts for estimating mixtures of linear regressions. discriminative latent-variable models are typically learned using em or gradient-based optimization, which suffer from local optima. in this paper, we develop a new computationally efficient and provably consistent estimator for the mixture of linear regressions, a simple instance of a discriminative latent-variable model. our approach relies on low-rank linear regression to recover a symmetric tensor, which can be factorized into the parameters using the tensor power method. we prove rates of convergence for our estimator and provide an empirical evaluation illustrating its strengths relative to local optimization (em).",4 "towards generalization and simplicity in continuous control. this work shows that policies with simple linear and rbf parameterizations can be trained to solve a variety of continuous control tasks, including the openai gym benchmarks. the performance of these trained policies is competitive with state of the art results, obtained with more elaborate parameterizations such as fully connected neural networks. furthermore, the existing training and testing scenarios are shown to be very limited and prone to over-fitting, thus giving rise to only trajectory-centric policies. training with a diverse initial state distribution is shown to produce global policies with much better generalization. this allows for interactive control scenarios where the system recovers from large on-line perturbations, as shown in the supplementary video.",4 "thermal to visible synthesis of face images using multiple regions.
the synthesis of visible spectrum faces from thermal facial imagery is a promising approach for heterogeneous face recognition: it enables existing face recognition software trained on visible imagery to be leveraged, and allows human analysts to verify cross-spectrum matches more effectively. we propose a new synthesis method to enhance the discriminative quality of synthesized visible face imagery by leveraging both global (e.g., entire face) and local regions (e.g., eyes, nose, and mouth). here, each region provides (1) an independent representation for the corresponding area, and (2) additional regularization terms, which impact the overall quality of the synthesized images. we analyze the effects of using multiple regions to synthesize a visible face image from a thermal face. we demonstrate that our approach improves cross-spectrum verification rates over recently published synthesis approaches. moreover, using our synthesized imagery, we report results on facial landmark detection (commonly used for image registration), which is a critical part of the face recognition process.",4 "attention on attention: architectures for visual question answering (vqa). visual question answering (vqa) is an increasingly popular topic in deep learning research, requiring the coordination of natural language processing and computer vision modules into a single architecture. we build upon the model which placed first in the vqa challenge by developing thirteen new attention mechanisms and introducing a simplified classifier. we performed 300 gpu hours of extensive hyperparameter and architecture searches and were able to achieve an evaluation score of 64.78%, outperforming the existing state-of-the-art single model's validation score of 63.15%.",4 "a robust method of vote aggregation and proposition verification for invariant local features. this paper presents a method for the analysis of the vote space created in the local features extraction process of a multi-detection system. the method, as opposed to the classic clustering approach, gives a high level of control over the clusters' composition through its verification steps. the proposed method comprises a graphical vote space presentation, proposition generation, two-pass iterative vote aggregation, and a cascade of filters for verification of the propositions.
the cascade of filters contains the minor algorithms needed for effective object detection verification. the new approach avoids the drawbacks of classic clustering approaches and gives substantial control over the process of detection. the method exhibits an exceptionally high detection rate in conjunction with a low false detection chance in comparison to alternative methods.",4 "positive definite kernels in machine learning. this survey is an introduction to positive definite kernels and the set of methods they have inspired in the machine learning literature, namely kernel methods. we first discuss some properties of positive definite kernels as well as reproducing kernel hilbert spaces, the natural extension of the set of functions $\{k(x,\cdot),x\in\mathcal{x}\}$ associated with a kernel $k$ defined on a space $\mathcal{x}$. we discuss at length the construction of kernel functions that take advantage of well-known statistical models. we provide an overview of numerous data-analysis methods which take advantage of reproducing kernel hilbert spaces, and discuss the idea of combining several kernels to improve performance on certain tasks. we also provide a short cookbook of the different kernels which are particularly useful for certain data-types such as images, graphs or speech segments.",19 "comparing neural and attractiveness-based visual features for artwork recommendation. advances in image processing and computer vision in recent years have brought about the use of visual features in artwork recommendation. recent works have shown that visual features obtained from pre-trained deep neural networks (dnns) perform well for recommending digital art. other recent works have shown that explicit visual features (evf) based on attractiveness can perform well in preference prediction tasks, but no previous work has compared dnn features versus specific attractiveness-based visual features (e.g. brightness, texture) in terms of recommendation performance. in this work, we study and compare the performance of dnn and evf features for the purpose of physical artwork recommendation, using transactional data from ugallery, an online store for physical paintings. in addition, we perform an exploratory analysis to understand whether the dnn embedded features have some relation with certain evf.
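the kernel-combination idea noted in the survey above rests on the fact that a nonnegative-weighted sum of positive definite kernels is again positive definite. a small sketch; the kernels, weights and gamma are illustrative choices:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # gaussian rbf kernel, a classic positive definite kernel
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def linear_kernel(X, Y):
    return X @ Y.T

def combined_kernel(X, Y, w=(0.7, 0.3), gamma=1.0):
    # a conic (nonnegative-weighted) combination of positive definite
    # kernels is again positive definite
    return w[0] * rbf_kernel(X, Y, gamma) + w[1] * linear_kernel(X, Y)
```

the resulting gram matrix is symmetric with nonnegative eigenvalues, which is exactly the property any downstream kernel method relies on.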
our results show that dnn features outperform evf, that certain evf features are better suited for physical artwork recommendation and, finally, we show evidence that certain neurons in the dnn might be partially encoding visual features such as brightness, providing an opportunity for explaining recommendations based on visual neural models.",4 "learning and visualizing localized geometric features using 3d-cnn: an application to manufacturability analysis of drilled holes. 3d convolutional neural networks (3d-cnn) have been used for object recognition based on the voxelized shape of an object. however, interpreting the decision making process of these 3d-cnns is still an infeasible task. in this paper, we present a unique 3d-cnn based gradient-weighted class activation mapping method (3d-gradcam) for visual explanations of the distinct local geometric features of interest within an object. to enable efficient learning of 3d geometries, we augment the voxel data with surface normals of the object boundary. we then train a 3d-cnn with this augmented data and identify the local features critical for decision-making using 3d-gradcam. one application of this feature identification framework is to recognize difficult-to-manufacture drilled hole features in a complex cad geometry. the framework can be extended to identify difficult-to-manufacture features at multiple spatial scales, leading to a real-time design for manufacturability decision support system.",19 "twitter hash tag recommendation. the rise in popularity of microblogging services like twitter has led to increased use of content annotation strategies like the hashtag. hashtags provide users with a tagging mechanism to help organize, group, and create visibility for their posts. this is a simple idea but challenging for the user in practice, which leads to infrequent usage. in this paper, we investigate various methods of recommending hashtags as new posts are created, to encourage more widespread adoption and usage. hashtag recommendation comes with numerous challenges, including processing huge volumes of streaming data and content which is small and noisy. we investigate preprocessing methods to reduce noise in the data and determine an effective method of hashtag recommendation based on popular classification algorithms.",4 "deep active learning for named entity recognition.
deep learning has yielded state-of-the-art performance on many natural language processing tasks, including named entity recognition (ner). however, this typically requires large amounts of labeled data. in this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. while active learning is sample-efficient, it can be computationally expensive since it requires iterative retraining. to speed this up, we introduce a lightweight architecture for ner, viz., the cnn-cnn-lstm model consisting of convolutional character and word encoders and a long short term memory (lstm) tag decoder. the model achieves nearly state-of-the-art performance on standard datasets for the task while being computationally much more efficient than the best performing models. we carry out incremental active learning during the training process, and are able to nearly match state-of-the-art performance with just 25\% of the original training data.",4 "stochastic neural networks for hierarchical reinforcement learning. deep reinforcement learning has achieved many impressive results in recent years. however, tasks with sparse rewards or long horizons continue to pose significant challenges. to tackle these important problems, we propose a general framework that first learns useful skills in a pre-training environment, and then leverages the acquired skills for learning faster in downstream tasks. our approach brings together some of the strengths of intrinsic motivation and hierarchical methods: the learning of useful skills is guided by a single proxy reward, the design of which requires very minimal domain knowledge about the downstream tasks. then a high-level policy is trained on top of these skills, providing a significant improvement in exploration and allowing us to tackle sparse rewards in the downstream tasks. to efficiently pre-train a large span of skills, we use stochastic neural networks combined with an information-theoretic regularizer. our experiments show that this combination is effective in learning a wide span of interpretable skills in a sample-efficient way, and can significantly boost the learning performance uniformly across a wide range of downstream tasks.",4 "successive nonnegative projection algorithm for robust nonnegative blind source separation.
in this paper, we propose a new fast and robust recursive algorithm for near-separable nonnegative matrix factorization, a particular nonnegative blind source separation problem. this algorithm, which we refer to as the successive nonnegative projection algorithm (snpa), is closely related to the popular successive projection algorithm (spa), but takes advantage of the nonnegativity constraint in the decomposition. we prove that snpa is more robust than spa and can be applied to a broader class of nonnegative matrices. this is illustrated on synthetic data sets and on a real-world hyperspectral image.",19 "a binary schema and computational algorithms to process vowel-based euphonic conjunctions for word searches. comprehensively searching for words in sanskrit e-text is a non-trivial problem because words can change their forms in different contexts. one such context is sandhi or euphonic conjunctions, which cause a word to change owing to the presence of adjacent letters or words. the change wrought by these possible conjunctions can be so significant in sanskrit that a simple search for a word in its given form alone can significantly reduce the success level of the search. this work presents a representational schema that represents letters in a binary format and reduces the paninian rules of euphonic conjunctions to simple bit set-unset operations. the work presents an efficient algorithm to process vowel-based sandhis using this schema. it further presents another algorithm that uses the sandhi processor to generate the possible transformed word forms of a given word, for use in comprehensive word searches.",4 "robust dictionary based data representation. robustness to noise and outliers is an important issue in linear representation for real applications. we focus on the problem where samples are not only grossly corrupted, but also exhibit 'sample specific' corruptions. a reasonable assumption is that corrupted samples cannot be represented by the dictionary, while clean samples can be well represented. this assumption is enforced in this paper by investigating the coefficients of corrupted samples. concretely, we require the coefficients of corrupted samples to be zero. in this way, the representation quality of clean data is assured without the effect of corrupted data.
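the successive projection algorithm that snpa builds on can be stated in a few lines. a minimal numpy sketch for the noiseless separable case (snpa itself replaces the orthogonal projection below with a nonnegativity-constrained one):

```python
import numpy as np

def spa(M, r):
    # successive projection algorithm for near-separable nmf: greedily
    # pick the column with largest residual norm, then project all
    # columns onto the orthogonal complement of the picked column
    R = M.astype(float).copy()
    picked = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)
        picked.append(j)
    return picked
```

on a separable matrix M = W H whose first r columns are the pure sources, the algorithm recovers exactly those column indices.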
at last, a robust dictionary based data representation approach and its sparse representation version are proposed, which have directive significance for future applications.",4 "echo state queueing network: a new reservoir computing learning tool. in the last decade, a new computational paradigm was introduced in the field of machine learning, under the name of reservoir computing (rc). rc models are neural networks with a recurrent part (the reservoir) that does not participate in the learning process, and the rest of the system, where no recurrence (no neural circuit) occurs. this approach has grown rapidly due to its success in solving learning tasks and other computational applications. success was also observed in another recently proposed neural network designed using queueing theory, the random neural network (randnn). both approaches have good properties, but some drawbacks have also been identified. in this paper, we propose a new rc model called the echo state queueing network (esqn), where we use ideas coming from randnns for the design of the reservoir. esqns consist of esns whose reservoir has a new dynamics inspired by recurrent randnns. the paper positions esqns in the global machine learning area, and provides examples of their use and performance. we show on largely used benchmarks that esqns are accurate tools, and we illustrate how they compare with standard esns.",4 "multi-task averaging. we present a multi-task learning approach to jointly estimate the means of multiple independent data sets. the proposed multi-task averaging (mta) algorithm results in a convex combination of the single-task maximum likelihood estimates. we derive the optimal minimum risk estimator and the minimax estimator, and show that these estimators can be efficiently estimated. simulations and real data experiments demonstrate that mta estimators often outperform both single-task and james-stein estimators.",19 "panoptic studio: a massively multiview system for social interaction capture. we present an approach to capture the 3d motion of a group of people engaged in social interaction. the core challenges in capturing social interactions are: (1) occlusion is functional and frequent; (2) subtle motion needs to be measured over a space large enough to host a social group; (3) human appearance and configuration variation is immense; and (4) attaching markers to the body may prime the nature of interactions.
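the multi-task averaging estimate described above is a convex combination of single-task means. a simplified fixed-weight sketch; the paper derives optimal and minimax weights, whereas gamma here is just an illustrative constant:

```python
import numpy as np

def multi_task_average(samples, gamma=0.5):
    # samples: list of 1-d arrays, one per task.  each mta estimate is a
    # convex combination of the task's own mean and the pooled mean;
    # gamma in [0, 1] controls the sharing (gamma = 0 gives back the
    # single-task maximum likelihood estimates)
    task_means = np.array([np.mean(s) for s in samples])
    pooled = task_means.mean()
    return (1 - gamma) * task_means + gamma * pooled
```

each estimate is shrunk toward the pooled mean, the same james-stein-flavoured sharing the abstract compares against.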
the panoptic studio system is organized around the thesis that social interactions should be measured through the integration of perceptual analyses over a large variety of view points. we present a modularized system designed around this principle, consisting of integrated structural, hardware, and software innovations. the system takes, as input, 480 synchronized video streams of multiple people engaged in social activities, and produces, as output, the labeled time-varying 3d structure of anatomical landmarks on individuals in the space. our algorithm is designed to fuse the ""weak"" perceptual processes in the large number of views by progressively generating skeletal proposals from low-level appearance cues, and a framework for temporal refinement is also presented by associating body parts to reconstructed dense 3d trajectory streams. our system and method are the first in reconstructing full body motion of more than five people engaged in social interactions without using markers. we also empirically demonstrate the impact of the number of views in achieving this goal.",4 "estimating the individual treatment effect from observational data using random forest methods. estimation of the individual treatment effect from observational data is complicated due to the challenges of confounding and selection bias. a useful inferential framework to address this is the counterfactual (potential outcomes) model, which takes the hypothetical stance of asking what if an individual had received both treatments. making use of random forests (rf) within the counterfactual framework, we estimate individual treatment effects by directly modeling the response. we find that accurate estimation of individual treatment effects is possible even in complex heterogeneous settings, but that the type of rf approach plays an important role in accuracy. methods designed to be adaptive to confounding, when used in parallel with out-of-sample estimation, do best. one method found to be especially promising is counterfactual synthetic forests. we illustrate this new methodology by applying it to a large comparative effectiveness trial, project aware, in order to explore the role drug use plays in sexual risk. the analysis reveals important connections between risky behavior, drug usage, and sexual risk.",19 "direct uncertainty estimation in reinforcement learning.
the optimal probabilistic approach to reinforcement learning is computationally infeasible. its simplification, consisting in neglecting the difference between the true environment and its model estimated using a limited number of observations, causes the exploration vs exploitation problem. uncertainty can be expressed in terms of a probability distribution over the space of environment models, and this uncertainty can be propagated to the action-value function via bellman iterations, which are, however, computationally insufficiently efficient. we consider the possibility of directly measuring uncertainty in the action-value function, and analyze the sufficiency of this facilitated approach.",4 "performance localisation. performance becomes an issue particularly when execution cost hinders the functionality of a program. typically a profiler is used to find the program code whose execution represents a large portion of the overall execution cost of the program. pinpointing where a performance issue exists provides a starting point for tracing the cause back through the program. while profiling shows where a performance issue manifests, we use mutation analysis to show where a performance improvement is likely to exist. we find that mutation analysis can indicate locations within a program which are highly impactful on the overall execution cost of the program, yet are executed relatively infrequently. by better locating potential performance improvements in programs we hope to make performance improvement more amenable to automation.",4 "a linear-time algorithm for bayesian image denoising based on a gaussian markov random field. in this paper, we consider bayesian image denoising based on a gaussian markov random field (gmrf) model, and propose a new algorithm for it. our method can solve bayesian image denoising problems, including hyperparameter estimation, in $o(n)$ time, where $n$ is the number of pixels in a given image. from the perspective of the order of computational time, this is a state-of-the-art algorithm for the present problem setting. moreover, the results of numerical experiments show that our method is in fact effective in practice.",19 "an approach to reducing annotation costs for bionlp. there is a broad range of bionlp tasks for which active learning (al) can significantly reduce annotation costs, and a specific al algorithm we have developed is particularly effective in reducing annotation costs for these tasks.
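for intuition on the linear-time gmrf claim above, a 1-d sketch: the map denoiser solves a tridiagonal system (i + lam*l)x = y, which the thomas algorithm handles in o(n). the chain-graph prior here is an illustrative stand-in for the paper's 2-d model:

```python
import numpy as np

def thomas_solve(a, b, c, d):
    # O(n) solver for a tridiagonal system; a: sub-diagonal (n-1),
    # b: diagonal (n), c: super-diagonal (n-1), d: right-hand side (n)
    n = len(b)
    cp = np.zeros(n - 1)
    dp = np.zeros(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = c[i] / m
        dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / m
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def gmrf_denoise_1d(y, lam):
    # MAP denoising under a chain-graph gaussian prior:
    # solve (I + lam * L) x = y, with L the chain-graph laplacian
    n = len(y)
    diag = 1.0 + lam * np.array([1.0] + [2.0] * (n - 2) + [1.0])
    off = -lam * np.ones(n - 1)
    return thomas_solve(off, diag, off, y)
```

the system is diagonally dominant, so the forward/backward sweeps are numerically stable, and the result matches a dense solve.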
We have previously developed an AL algorithm called ClosestInitPA that works best with tasks having the following characteristics: redundancy in training material, burdensome annotation costs, support vector machines (SVMs) that work well for the task, and imbalanced datasets (i.e. when set up as a binary classification problem, one class is substantially rarer than the other). Many BioNLP tasks have these characteristics, and thus our AL algorithm is a natural approach to apply to BioNLP tasks.",4 "Context-dependent fine-grained entity type tagging. Entity type tagging is the task of assigning category labels to each mention of an entity in a document. While standard systems focus on a small set of types, recent work (Ling and Weld, 2012) suggests that using a large fine-grained label set can lead to dramatic improvements in downstream tasks. In the absence of labeled training data, existing fine-grained tagging systems obtain examples automatically, using resolved entities and their types extracted from a knowledge base. However, since the appropriate type often depends on context (e.g. Washington could be tagged either as a city or a government), this procedure can result in spurious labels, leading to poorer generalization. We propose the task of context-dependent fine type tagging, where the set of acceptable labels for a mention is restricted to those deducible from the local context (e.g. the sentence or document). We introduce new resources for this task: 12,017 mentions annotated with their context-dependent fine types, and we provide baseline experimental results on this data.",4 "Camera pose filtering with local regression geodesics on the Riemannian manifold of dual quaternions. Time-varying, smooth trajectory estimation is of great interest to the vision community for accurate and well-behaving 3D systems. In this paper, we propose a novel principal component local regression filter acting directly on the Riemannian manifold of unit dual quaternions $\mathbb{D}\mathbb{H}_1$. We use the numerically stable Lie algebra of dual quaternions together with the $\exp$ and $\log$ operators to locally linearize the 6D pose space. 
Unlike state-of-the-art path smoothing methods which either operate on $SO\left(3\right)$ rotation matrices or on the hypersphere $\mathbb{H}_1$ of quaternions, we treat orientation and translation jointly on the dual quaternion quadric in the 7-dimensional real projective space $\mathbb{R}\mathbb{P}^7$. We provide an outlier-robust IRLS algorithm for generic pose filtering that exploits this manifold structure. Besides a theoretical analysis, experiments on synthetic and real data show the practical advantages of manifold-aware filtering for pose tracking and smoothing.",4 "An empirical model of acknowledgment for spoken-language systems. We refine and extend prior views of the description, purposes, and contexts-of-use of acknowledgment acts through an empirical examination of the use of acknowledgments in task-based conversation. We distinguish three broad classes of acknowledgments (other-->ackn, self-->other-->ackn, self+ackn) and present a catalogue of 13 patterns within these classes that account for the specific uses of acknowledgment in the corpus.",2 "Generative adversarial nets for multiple text corpora. Generative adversarial nets (GANs) have been successfully applied to the artificial generation of image data. In terms of text data, much has been done on the artificial generation of natural language from a single corpus. We consider multiple text corpora as input data, for which there are two applications of GANs: (1) the creation of consistent cross-corpus word embeddings given different word embeddings per corpus; (2) the generation of robust bag-of-words document embeddings for each corpus. We demonstrate our GAN models on real-world text data sets from different corpora, and show that the embeddings from both models lead to improvements in supervised learning problems.",4 Plummer autoencoders. Estimating the true density in high-dimensional feature spaces is a well-known problem in machine learning. This work shows that it is possible to formulate the optimization problem as a minimization and to use the representational power of neural networks to learn complex densities. A theoretical bound on the estimation error when dealing with a finite number of samples is given. 
The proposed theory is corroborated by extensive experiments on different datasets and compared against several existing approaches from the families of generative adversarial networks and autoencoder-based models.,4 "A regularization approach to blind deblurring and denoising of QR barcodes. QR bar codes are prototypical images for which part of the image is a priori known (required patterns). Open source bar code readers, such as ZBar, are readily available. We exploit both facts to provide and assess purely regularization-based methods for blind deblurring of QR bar codes in the presence of noise.",4 "Multi-scale mining of fMRI data with hierarchical structured sparsity. Inverse inference, or ""brain reading"", is a recent paradigm for analyzing functional magnetic resonance imaging (fMRI) data, based on pattern recognition and statistical learning. By predicting cognitive variables related to brain activation maps, this approach aims at decoding brain activity. Inverse inference takes into account the multivariate information between voxels and is currently the only way to assess how precisely cognitive information is encoded by the activity of neural populations within the whole brain. However, it relies on a prediction function that is plagued by the curse of dimensionality, since there are far more features than samples, i.e., more voxels than fMRI volumes. To address this problem, different methods have been proposed, such as, among others, univariate feature selection, feature agglomeration and regularization techniques. In this paper, we consider a sparse hierarchical structured regularization. Specifically, the penalization we use is constructed from a tree obtained by spatially-constrained agglomerative clustering. This approach encodes the spatial structure of the data at different scales into the regularization, which makes the overall prediction procedure more robust to inter-subject variability. The regularization used induces the selection of spatially coherent predictive brain regions simultaneously at different scales. We test our algorithm on real data acquired to study the mental representation of objects, and show that the proposed algorithm not only delineates meaningful brain regions but also yields better prediction accuracy than reference methods.",19 "Interactive multiclass segmentation using superpixel classification. This paper addresses the problem of interactive multiclass segmentation. 
We propose a fast and efficient new interactive segmentation method called Superpixel Classification-based Interactive Segmentation (SCIS). From a few strokes drawn by a human user over an image, this method extracts the relevant semantic objects. To obtain a fast calculation and an accurate segmentation, SCIS uses superpixel over-segmentation and support vector machine classification. In this paper, we demonstrate that SCIS significantly outperforms competing algorithms by evaluating its performance on the reference benchmarks of McGuinness and Santner.",4 "Deep matching prior network: toward tighter multi-oriented text detection. Detecting incidental scene text is a challenging task because of multi-orientation, perspective distortion, and variation in text size, color and scale. Retrospective research has focused on using a rectangular bounding box or horizontal sliding window to localize text, which may result in redundant background noise, unnecessary overlap or even information loss. To address these issues, we propose a new convolutional neural network (CNN) based method, named Deep Matching Prior Network (DMPNet), to detect text with a tighter quadrangle. First, we use quadrilateral sliding windows in several specific intermediate convolutional layers to roughly recall the text with a higher overlapping area, and then a shared Monte-Carlo method is proposed for fast and accurate computation of the polygonal areas. After that, we design a sequential protocol for relative regression which can exactly predict text with a compact quadrangle. Moreover, an auxiliary smooth Ln loss is also proposed for further regressing the position of text, which has better overall performance than L2 loss and smooth L1 loss in terms of robustness and stability. The effectiveness of our approach is evaluated on a public word-level, multi-oriented scene text database, the ICDAR 2015 Robust Reading Competition Challenge 4 ""Incidental scene text localization"". The performance of our method is evaluated using F-measure and found to be 70.64%, outperforming the existing state-of-the-art method with an F-measure of 63.76%.",4 "Learning document image binarization from data. In this paper we present a fully trainable binarization solution for degraded document images. 
Unlike previous attempts that often used simple features with a series of pre- and post-processing steps, our solution encodes all heuristics about whether or not a pixel is foreground text into a high-dimensional feature vector and learns a more complicated decision function. In particular, we prepare features of three types: 1) existing features for binarization such as intensity [1], contrast [2], [3], and Laplacian [4], [5]; 2) reformulated features from existing binarization decision functions such as [6] and [7]; and 3) our newly developed features, namely the Logarithm Intensity Percentile (LIP) and the Relative Darkness Index (RDI). Initial experimental results show that using only selected samples (about 1.5% of all available training data), we can achieve binarization performance comparable to fine-tuned (typically by hand), state-of-the-art methods. Additionally, the trained document binarization classifier shows good generalization capabilities on out-of-domain data.",4 "Extracting a bilingual Persian-Italian lexicon from comparable corpora using different types of seed dictionaries. Bilingual dictionaries are important in various fields of natural language processing. In recent years, research on extracting new bilingual lexicons from non-parallel (comparable) corpora has been proposed. Almost all such work uses a small existing dictionary or other resource to make an initial list called the ""seed dictionary"". In this paper we discuss the use of different types of dictionaries as the initial starting list for creating a bilingual Persian-Italian lexicon from a comparable corpus. In our experiments we apply state-of-the-art techniques with three different seed dictionaries: an existing dictionary, a dictionary created with a pivot-based schema, and a dictionary extracted from a small Persian-Italian parallel text. An interesting challenge in our approach is to find a way to combine the different dictionaries in order to produce a better and more accurate lexicon. To combine seed dictionaries, we propose two different combination models and examine the effect of these novel combination models on various comparable corpora with differing degrees of comparability. We conclude with a proposal for a new weighting system to improve the extracted lexicon. 
The experimental results produced by our implementation show the efficiency of the proposed models.",4 "Evidential reasoning in parallel hierarchical vision programs. This paper presents an efficient adaptation and application of the Dempster-Shafer theory of evidence, one that can be used effectively in a massively parallel hierarchical system for visual pattern perception. It describes the techniques used, and shows through an extended example how they serve to improve the system's performance as it applies multiple-level sets of processes.",4 "Qualitative shape representation based on the qualitative relative direction and distance calculus eOPRAm. This document serves as a brief technical report, detailing the processes used to represent and reconstruct simplified polygons using qualitative spatial descriptions, as defined in the eOPRAm qualitative spatial calculus.",4 "To go deep or wide in learning?. To achieve acceptable performance for AI tasks, one can either use sophisticated feature extraction methods as the first layer in a two-layered supervised learning model, or learn the features directly using a deep (multi-layered) model. While the first approach is very problem-specific, the second approach has computational overheads in learning multiple layers and fine-tuning the model. In this paper, we propose an approach called wide learning based on arc-cosine kernels, which learns a single layer of infinite width. We propose exact and inexact learning strategies for wide learning and show that wide learning with a single layer outperforms both a single layer and deep architectures of finite width on some benchmark datasets.",4 "FastMask: segment multi-scale object candidates in one shot. Objects appear at different scales in natural images. This fact requires methods dealing with object-centric tasks (e.g. object proposal) to have robust performance over variances in object scales. In this paper, we present a novel segment proposal framework, namely FastMask, which takes advantage of the hierarchical features in deep convolutional neural networks to segment multi-scale objects in one shot. Innovatively, we adapt the segment proposal network into three different functional components (body, neck and head). We propose a weight-shared residual neck module as well as a scale-tolerant attentional head module for efficient one-shot inference. 
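The wide-learning entry above relies on arc-cosine kernels, which correspond to a single hidden layer of infinite width. As a hedged illustration (not the paper's implementation), here is a minimal NumPy sketch of the degree-1 arc-cosine kernel, whose Gram matrix can be plugged into any kernel machine; all variable names are illustrative:

```python
import numpy as np

def arc_cosine_kernel(X, Y):
    """Degree-1 arc-cosine kernel: equivalent to an infinitely wide
    single hidden layer of threshold-linear (ReLU-like) units.
    k(x, y) = (1/pi) * ||x|| * ||y|| * (sin t + (pi - t) cos t),
    where t is the angle between x and y."""
    nx = np.linalg.norm(X, axis=1)
    ny = np.linalg.norm(Y, axis=1)
    cos = np.clip((X @ Y.T) / np.outer(nx, ny), -1.0, 1.0)  # guard rounding
    theta = np.arccos(cos)
    J = np.sin(theta) + (np.pi - theta) * np.cos(theta)
    return np.outer(nx, ny) * J / np.pi

# Gram matrix over a toy dataset
X = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
K = arc_cosine_kernel(X, X)
```

Note that on the diagonal the angle is zero, so k(x, x) = ||x||^2, which is a quick sanity check for the formula.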
On the MS COCO benchmark, the proposed FastMask outperforms all state-of-the-art segment proposal methods in average recall while being 2~5 times faster. Moreover, with a slight trade-off in accuracy, FastMask can segment objects in near real time (~13 fps) on 800*600 resolution images, demonstrating its potential for practical applications. Our implementation is available at https://github.com/voidrank/FastMask.",4 "Deep learning for reverse photon migration in diffuse optical tomography. Can artificial intelligence (AI) learn complicated non-linear physics? Here we propose a novel deep learning approach that learns non-linear photon scattering physics and obtains an accurate 3D distribution of optical anomalies. In contrast to traditional black-box deep learning approaches to inverse problems, our deep network learns to invert the Lippmann-Schwinger integral equation which describes the essential physics of photon migration of diffuse near-infrared (NIR) photons in turbid media. As an example of clinical relevance, we applied the method to our prototype diffuse optical tomography (DOT) system. We show that our deep neural network, trained with only simulation data, can accurately recover the location of anomalies within biomimetic phantoms and live animals without the use of an exogenous contrast agent.",4 "Bridging the gap between reinforcement learning and knowledge representation: a logical off- and on-policy framework. Knowledge representation is an important issue in reinforcement learning. In this paper, we bridge the gap between reinforcement learning and knowledge representation by providing a rich knowledge representation framework, based on normal logic programs with answer set semantics, that is capable of solving model-free reinforcement learning problems for more complex domains and exploits domain-specific knowledge. We prove the correctness of our approach. We show that the complexity of finding an offline and an online policy for a model-free reinforcement learning problem in our approach is NP-complete. Moreover, we show that any model-free reinforcement learning problem in an MDP environment can be encoded as a SAT problem. The importance of model-free reinforcement",4 "Dirichlet process mixed random measures: a nonparametric topic model for labeled data. We describe a nonparametric topic model for labeled data. 
The model uses a mixture of random measures (MRM) as the base distribution of the Dirichlet process (DP) of the HDP framework, so we call it the DP-MRM. To model labeled data, we define a DP-distributed random measure for each label, and the resulting model generates an unbounded number of topics for each label. We apply DP-MRM on single-labeled and multi-labeled corpora of documents and compare its performance on label prediction with MedLDA, LDA-SVM, and Labeled-LDA. We further enhance the model by incorporating ddCRP and modeling multi-labeled images for image segmentation and object labeling, comparing its performance with nCuts and rddCRP.",4 "Robust distributed online prediction. The standard model of online prediction deals with serial processing of inputs by a single processor. However, in large-scale online prediction problems, where inputs arrive at a high rate, an increasingly common necessity is to distribute the computation across several processors. A non-trivial challenge is to design distributed algorithms for online prediction which maintain good regret guarantees. In \cite{dmb}, we presented the DMB algorithm, a generic framework for converting any serial gradient-based online prediction algorithm into a distributed algorithm. Moreover, its regret guarantee is asymptotically optimal for smooth convex loss functions and stochastic inputs. On the flip side, it is fragile to many types of failures that are common in distributed environments. In this companion paper, we present variants of the DMB algorithm that are resilient to many types of network failures and tolerant to varying performance of the computing nodes.",4 "Inference by minimizing size, divergence, or their sum. We speed up marginal inference by ignoring factors that do not significantly contribute to overall accuracy. In order to pick a suitable subset of factors to ignore, we propose three schemes: minimizing the number of model factors under a bound on the KL divergence between the pruned and full models; minimizing the KL divergence under a bound on factor count; and minimizing the weighted sum of KL divergence and factor count. All three problems are solved using an approximation of the KL divergence that can be calculated in terms of marginals computed on a simple seed graph. 
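The DMB framework in the distributed-prediction entry above converts a serial gradient-based online learner into a distributed one by averaging gradients computed on local mini-batches and applying a single serial-style update. A minimal sketch of that pattern on assumed synthetic least-squares data (names and data are illustrative, not the paper's code):

```python
import numpy as np

def dmb_round(w, shards, grad_fn, lr=0.1):
    """One DMB-style round: every node computes a gradient on its local
    mini-batch; the averaged gradient feeds one serial-style update."""
    grads = [grad_fn(w, X, y) for X, y in shards]
    return w - lr * np.mean(grads, axis=0)

def sq_grad(w, X, y):
    """Gradient of the average squared loss 0.5 * ||Xw - y||^2 / n."""
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
# four "nodes", each holding a local shard of inputs
shards = []
for _ in range(4):
    X = rng.normal(size=(32, 3))
    shards.append((X, X @ w_true + 0.01 * rng.normal(size=32)))

w = np.zeros(3)
for _ in range(200):
    w = dmb_round(w, shards, sq_grad)
```

Because the averaged gradient equals the gradient of the pooled batch, the update sequence matches what a serial learner would do on the same data, which is the intuition behind DMB's regret guarantee.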
Applied to synthetic image denoising and to three different types of NLP parsing models, this technique performs marginal inference up to 11 times faster than loopy BP, with graph sizes reduced by up to 98% - at comparable error in marginals and parsing accuracy. We also show that minimizing the weighted sum of divergence and size is substantially faster than minimizing either of the other objectives based on the approximation to divergence presented here.",4 "Refining source representations with relation networks for neural machine translation. Although neural machine translation (NMT) with the encoder-decoder framework has achieved great success in recent times, it still suffers from drawbacks: RNNs tend to forget old information which is often useful, and the encoder operates on words without considering word relationships. To solve these problems, we introduce relation networks (RNs) into NMT to refine the encoding representations of the source. In our method, the RN first augments the representation of each source word with its neighbors and then reasons about all possible pairwise relations between them. The source representations and relations are then fed into the attention module and the decoder together, keeping the main encoder-decoder architecture unchanged. Experiments on two Chinese-to-English data sets of different scales show that our method can significantly outperform competitive baselines.",4 "Narrative science systems: a review. Automatic narration of events and entities is the need of the hour, especially when live reporting is critical and the volume of information to be narrated is huge. This paper discusses the challenges in this context, along with the algorithms used to build such systems. Through a systematic study, we infer that most of the work done in this area is related to statistical data. We also found that subjective evaluation and the contribution of experts is limited for narration in this context.",4 "Tensor regression networks with various low-rank tensor approximations. Tensor regression networks achieve a high rate of compression of model parameters in multilayer perceptrons (MLPs) with only a slight impact on performance. A tensor regression layer imposes low-rank constraints on the regression weight tensor, and replaces the flattening operation of a traditional MLP. 
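The tensor regression entry above replaces a flattening-plus-dense layer with low-rank structure. The simplest instance of the idea is a rank-r matrix factorization of the regression weights; a hedged sketch (dimensions and names are illustrative, not the paper's model):

```python
import numpy as np

def low_rank_layer(x, U, V, b):
    """Rank-r regression layer: the dense weight matrix W = U @ V is
    never materialized, cutting parameters from d_in * d_out
    down to r * (d_in + d_out)."""
    return x @ U @ V + b

rng = np.random.default_rng(0)
d_in, d_out, r = 256, 64, 4
U = rng.normal(size=(d_in, r)) / np.sqrt(d_in)
V = rng.normal(size=(r, d_out)) / np.sqrt(r)
b = np.zeros(d_out)

y = low_rank_layer(rng.normal(size=(8, d_in)), U, V, b)
dense_params = d_in * d_out                   # 16384
low_rank_params = r * (d_in + d_out)          # 1280
compression = low_rank_params / dense_params  # ~0.078
```

Tensor regression layers generalize this by keeping the multi-modal shape of the input and factorizing the weight tensor (e.g. in Tucker or CP form) rather than a flat matrix, which is how compression rates like the 0.018 quoted above become possible.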
We investigate tensor regression networks using various low-rank tensor approximations, aiming to leverage the multi-modal structure of high dimensional data by enforcing efficient low-rank constraints. We provide a theoretical analysis giving insights into the choice of rank parameters. We evaluated the performance of the proposed model against state-of-the-art deep convolutional models. On the CIFAR-10 dataset, we achieved a compression rate of 0.018 with a sacrifice in accuracy of less than 1%.",4 "Detection and tracking of liquids with fully convolutional networks. Recent advances in AI and robotics have claimed many incredible results with deep learning, yet no work to date has applied deep learning to the problem of liquid perception and reasoning. In this paper, we apply fully-convolutional deep neural networks to the tasks of detecting and tracking liquids. We evaluate three models: a single-frame network, a multi-frame network, and an LSTM recurrent network. Our results show that the best liquid detection results are achieved when aggregating data over multiple frames, in contrast to standard image segmentation. They also show that the LSTM network outperforms the other two on both tasks. This suggests that LSTM-based neural networks have the potential to be a key component for enabling robots to handle liquids using robust, closed-loop controllers.",4 "SlugBot: an application of a novel and scalable open domain socialbot framework. In this paper we introduce a novel, open domain socialbot for the Amazon Alexa Prize competition, aimed at carrying on friendly conversations with users on a variety of topics. We present our modular system, highlighting our different data sources and how we use the human mind as a model for data management. Additionally we build and employ natural language understanding and information retrieval tools and APIs to expand our knowledge bases. We describe our semistructured, scalable framework for crafting topic-specific dialogue flows, and give details on our dialogue management schemes and scoring mechanisms. Finally we briefly evaluate the performance of our system and observe the challenges that an open domain socialbot faces.",4 "Deep representation learning with part loss for person re-identification. Learning discriminative representations for unseen person images is critical for person re-identification (ReID). 
Most current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations commonly focus on several body parts discriminative to the training set, rather than on the entire human body. Inspired by the structural risk minimization principle in SVM, we revise the traditional deep representation learning procedure to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for each image and computes the person classification loss on each part separately. Compared with the traditional global classification loss, simultaneously considering multiple part losses enforces the deep network to focus on the entire human body and learn discriminative representations for different parts. Experimental results on three datasets, i.e., Market1501, CUHK03, and VIPeR, show that our representation outperforms existing deep representations.",4 "Bridge simulation and metric estimation on landmark manifolds. We present an inference algorithm and connected Monte Carlo based estimation procedures for metric estimation from landmark configurations distributed according to the transition distribution of a Riemannian Brownian motion arising from the Large Deformation Diffeomorphic Metric Mapping (LDDMM) metric. The distribution possesses properties similar to the regular Euclidean normal distribution, but its transition density is governed by a high-dimensional PDE with no closed-form solution in the nonlinear case. We show how the density can be numerically approximated by Monte Carlo sampling of conditioned Brownian bridges, and we use this to estimate parameters of the LDDMM kernel and thus the metric structure by maximum likelihood.",4 "Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. The past decade has seen an explosion in the amount of digital information stored in electronic health records (EHRs). While primarily designed for archiving patient clinical information and performing administrative healthcare tasks, many researchers have found secondary uses of these records for various clinical informatics tasks. 
In the same period, the machine learning community has seen widespread advances in deep learning techniques, which have also been successfully applied to the vast amount of EHR data. In this paper, we review these deep EHR systems, examining architectures, technical aspects, and clinical applications. We also identify shortcomings of the current techniques and discuss avenues for future research in EHR-based deep learning.",4 "A feature based approach to video compression. The high cost of panoramic image stitching via image matching algorithms is a problem for practical real-time performance. In this paper, we take full advantage of the Harris corner invariant characterization method, in the sense of its invariance to light intensity, translation and rotation, to build a real-time panoramic image stitching algorithm. According to the basic characteristics and performance of the FPGA and the classical algorithm, several modules such as feature point extraction, matching and description optimize the feature-based logic. The real-time optimized system can achieve a high-precision match. The new algorithm processes images in the pixel domain obtained from a CCD camera on a Xilinx Spartan-6 hardware platform. The image stitching algorithm eventually forms a portable interface to output high-definition content to a display. The results showed that the proposed algorithm has higher precision together with good real-time performance and robustness.",4 Solving the Goddard problem with an influence diagram. Influence diagrams are a decision-theoretic extension of probabilistic graphical models. In this paper we show how they can be used to solve the Goddard problem. We present results of numerical experiments with this problem and compare the solutions provided by influence diagrams with the optimal solution.,4 "A study of the cuckoo optimization algorithm for a production planning problem. Constrained nonlinear programming problems are hard problems, and one of the most widely used and common problems to optimize is the production planning problem. In this study, one of the mathematical models of production planning from the survey is solved with the cuckoo algorithm. The cuckoo algorithm is an efficient method for solving continuous nonlinear problems. Moreover, the mentioned models of production planning are solved with a genetic algorithm and LINGO software, and the results are compared. 
The cuckoo algorithm is a suitable choice for this optimization given its convergence to a solution.",12 "How do urban legends go viral?. Urban legends are a genre of modern folklore, consisting of stories about rare and exceptional events, plausible enough to be believed, which tend to propagate inexorably across communities. In our view, urban legends represent a form of ""sticky"" deceptive text, marked by a tension between the credible and the incredible. They should be credible like a news article and incredible like a fairy tale to go viral. In particular we focus on the idea that urban legends should mimic the details of news (who, where, when) to be credible, while being emotional and readable like a fairy tale to be catchy and memorable. Using NLP tools we provide a quantitative analysis of these prototypical characteristics. We also lay out machine learning experiments showing that it is possible to recognize an urban legend using just these simple features.",4 "Predicting privileged information for height estimation. In this paper, we propose a novel regression-based method employing privileged information to estimate height using human metrology. The actual values of anthropometric measurements are difficult to estimate accurately using state-of-the-art computer vision algorithms. Hence, we use ratios of anthropometric measurements as features. Since many anthropometric measurements are not available at test time in real-life scenarios, we employ a learning using privileged information (LUPI) framework in a regression setup. Instead of using the LUPI paradigm for regression in its original form (i.e., \epsilon-SVR+), we train regression models that predict the privileged information at test time. These predictions are used, along with the observable features, to perform height estimation. Once the height is estimated, a mapping to classes is performed. We demonstrate that the proposed approach estimates height better and faster than the \epsilon-SVR+ algorithm and report results for different genders and quartiles of humans.",4 Fitness-based adaptive control of parameters in genetic programming: adaptive value setting of mutation rate and flood mechanisms. This paper concerns applications of genetic algorithms and genetic programming to tasks for which it is difficult to find a representation, and which map to a highly complex and discontinuous fitness landscape. In such cases the standard algorithm is prone to getting trapped in local extremes. 
This paper proposes several adaptive mechanisms useful for preventing the search from getting trapped.,4 "Multi-view metric learning for multi-view video summarization. Traditional methods of video summarization are designed to generate summaries for single-view video records, and thus they cannot fully exploit the redundancy in multi-view video records. In this paper, we present a multi-view metric learning framework for multi-view video summarization that combines the advantages of maximum margin clustering with the disagreement minimization criterion. The learning framework thus has the ability to find a metric that best separates the data, and meanwhile forces the learned metric to maintain the original intrinsic information between data points, for example geometric information. Facilitated by such a framework, a systematic solution to the multi-view video summarization problem is developed. To the best of our knowledge, this is the first time multi-view video summarization has been addressed from the viewpoint of metric learning. The effectiveness of the proposed method is demonstrated by experiments.",4 "Transductive zero-shot action recognition by word-vector embedding. The number of categories for action recognition is growing rapidly and it has become increasingly hard to label sufficient training data for learning conventional models for all categories. Instead of collecting ever more data and labelling them exhaustively for all categories, an attractive alternative approach is ""zero-shot learning"" (ZSL). To that end, this study constructs a mapping between visual features and a semantic descriptor of each action category, allowing new categories to be recognised in the absence of any visual training data. Existing ZSL studies focus primarily on still images and attribute-based semantic representations. In this work, we explore word-vectors as the shared semantic space to embed videos and category labels for ZSL action recognition. This is a challenging problem because, compared to the existing ZSL for still images and/or attributes, the mapping from video space-time features of actions to the semantic space is more complex and harder to learn for the purpose of generalising over any cross-category domain shift. To solve this generalisation problem in ZSL action recognition, we investigate a series of synergistic strategies to improve upon the standard ZSL pipeline. Most of these strategies are transductive in nature, which means they access the testing data in the training phase.",4 "Variational bi-LSTMs. 
Recurrent neural networks like long short-term memory (LSTM) are important architectures for sequential prediction tasks. LSTMs (and RNNs in general) model sequences along the forward time direction. Bidirectional LSTMs (bi-LSTMs), on the other hand, model sequences along both the forward and backward directions and are generally known to perform better at such tasks because they capture a richer representation of the data. In the training of bi-LSTMs, the forward and backward paths are learned independently. We propose a variant of the bi-LSTM architecture, which we call variational bi-LSTM, that creates a channel between the two paths (during training; it may be omitted during inference), thus optimizing the two paths jointly. We arrive at this joint objective for our model by minimizing a variational lower bound of the joint likelihood of the data sequence. Our model acts as a regularizer and encourages the two networks to inform each other in making their respective predictions using distinct information. We perform ablation studies to better understand the different components of our model and evaluate the method on various benchmarks, showing state-of-the-art performance.",19 "RAND-WALK: a latent variable model approach to word embeddings. Semantic word embeddings represent the meaning of a word via a vector, and are created by diverse methods. Many use nonlinear operations on co-occurrence statistics, and have hand-tuned hyperparameters and reweighting methods. This paper proposes a new generative model, a dynamic version of the log-linear topic model of~\citet{mnih2007three}. The methodological novelty is to use the prior to compute closed form expressions for word statistics. This provides a theoretical justification for nonlinear models like PMI, word2vec, and GloVe, as well as some hyperparameter choices. It also helps explain why low-dimensional semantic embeddings contain linear algebraic structure that allows solution of word analogies, as shown by~\citet{mikolov2013efficient} and many subsequent papers. Experimental support is provided for the generative model assumptions, the most important of which is that latent word vectors are fairly uniformly dispersed in space.",4 "A hypergraph-partitioned vertex programming approach for large-scale consensus optimization. 
In modern data science problems, techniques for extracting value from big data require performing large-scale optimization over heterogenous, irregularly structured data. Much of this data is best represented as multi-relational graphs, making vertex programming abstractions such as those of Pregel and GraphLab ideal fits for modern large-scale data analysis. In this paper, we describe a vertex-programming implementation of a popular consensus optimization technique known as the alternating direction method of multipliers (ADMM). ADMM consensus optimization allows elegant solution of complex objectives such as inference in rich probabilistic models. We also introduce a novel hypergraph partitioning technique that improves over state-of-the-art partitioning techniques for vertex programming and significantly reduces the communication cost by reducing the number of replicated nodes by up to an order of magnitude. We implemented our algorithm in GraphLab and measured scaling performance on a variety of realistic bipartite graph distributions and a large synthetic voter-opinion analysis application. In our experiments, we are able to achieve a 50% improvement in runtime over the current state-of-the-art GraphLab partitioning scheme.",4 "Sparsey: event recognition via deep hierarchical sparse distributed codes. The visual cortex's hierarchical, multi-level organization is captured in many biologically inspired computational vision models, the general idea being that progressively larger scale, more complex spatiotemporal features are represented in progressively higher areas. However, most earlier models use localist representations (codes) in each representational field, which we equate with the cortical macrocolumn (mac), at each level. In localism, each represented feature/event (item) is coded by a single unit. Our model, Sparsey, is also hierarchical but, crucially, uses sparse distributed coding (SDC) in every mac at all levels. In SDC, each represented item is coded by a small subset of the mac's units. SDCs of different items can overlap, and the size of the overlap between items can represent their similarity. The difference between localism and SDC is crucial because SDC allows the two essential operations of associative memory, storing a new item and retrieving the best-matching stored item, to be done in fixed time for the life of the model. 
Since the model's core algorithm, which performs both storage and retrieval (inference), makes a single pass over the macs for each time step, the overall model's storage/retrieval operation is also fixed-time, a criterion we consider essential for scalability to huge datasets. A 2010 paper described a nonhierarchical version of the model in the context of purely spatial pattern processing. Here, we elaborate a fully hierarchical model (arbitrary numbers of levels and macs per level), describing novel model principles like progressive critical periods, dynamic modulation of principal cells' activation functions based on a mac-level familiarity measure, representation of multiple simultaneously active hypotheses, and a novel method of time warp invariant recognition, and we report results showing learning/recognition of spatiotemporal patterns.",16 "An IoT endpoint system-on-chip for secure and energy-efficient near-sensor analytics. Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes the energy spent on communication and reduces the network load - but it also poses security concerns, as valuable data are stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a system-on-chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V, achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, and 25MIPS/mW in software. 
strong argument real-life flexible application platform, show experimental results three secure analytics use cases: secure autonomous aerial surveillance state-of-the-art deep cnn consuming 3.16pj per equivalent risc op; local cnn-based face detection secured remote recognition 5.74pj/op; seizure detection encrypted data collection eeg within 12.7pj/op.",4 "parameter estimation softmax decision-making models linear objective functions. eye towards human-centered automation, contribute development systematic means infer features human decision-making behavioral data. motivated common use softmax selection models human decision-making, study maximum likelihood parameter estimation problem softmax decision-making models linear objective functions. present conditions likelihood function convex. allow us provide sufficient conditions convergence resulting maximum likelihood estimator construct asymptotic distribution. case models nonlinear objective functions, show estimator applied linearizing nominal parameter value. apply estimator fit stochastic ucl (upper credible limit) model human decision-making human subject data. show statistically significant differences behavior across related, distinct, tasks.",12 "knowledge common knowledge distributed environment. reasoning knowledge seems play fundamental role distributed systems. indeed, reasoning central part informal intuitive arguments used design distributed protocols. communication distributed system viewed act transforming system's state knowledge. paper presents general framework formalizing reasoning knowledge distributed systems. argue states knowledge groups processors useful concepts design analysis distributed protocols. particular, distributed knowledge corresponds knowledge ``distributed'' among members group, common knowledge corresponds fact ``publicly known''. relationship common knowledge variety desirable actions distributed system illustrated. 
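The softmax parameter-estimation abstract above studies maximum likelihood for softmax choice models with linear objective functions. A minimal sketch of the likelihood being maximized, with hypothetical option features (this is not the paper's UCL model):

```python
import math

# Softmax choice model with a linear objective: P(choose i) ∝ exp(theta·x_i).
# theta and the option features below are hypothetical illustration values.

def softmax_probs(theta, options):
    scores = [sum(t * x for t, x in zip(theta, xs)) for xs in options]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def neg_log_likelihood(theta, data):
    """data: list of (options, chosen_index) pairs; convex in theta."""
    nll = 0.0
    for options, chosen in data:
        p = softmax_probs(theta, options)
        nll -= math.log(p[chosen])
    return nll

options = [[1.0, 0.0], [0.0, 1.0]]
data = [(options, 0), (options, 0), (options, 1)]
# With theta = 0 every option is equally likely, so NLL = 3 * log(2).
print(neg_log_likelihood([0.0, 0.0], data))
```

Because the objective is linear in the parameters, this negative log-likelihood is convex, which is what makes the convergence and asymptotic results in the abstract tractable.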
furthermore, shown that, formally speaking, practical systems common knowledge cannot attained. number weaker variants common knowledge attainable many cases interest introduced investigated.",4 "multimodal convolutional neural networks matching image sentence. paper, propose multimodal convolutional neural networks (m-cnns) matching image sentence. m-cnn provides end-to-end framework convolutional architectures exploit image representation, word composition, matching relations two modalities. specifically, consists one image cnn encoding image content, one matching cnn learning joint representation image sentence. matching cnn composes words different semantic fragments learns inter-modal relations image composed fragments different levels, thus fully exploit matching relations image sentence. experimental results benchmark databases bidirectional image sentence retrieval demonstrate proposed m-cnns effectively capture information necessary image sentence matching. specifically, proposed m-cnns bidirectional image sentence retrieval flickr30k microsoft coco databases achieve state-of-the-art performances.",4 graph-based denoising time-varying point clouds. noisy 3d point clouds arise many applications. may due errors constructing 3d model images simply imprecise depth sensors. point clouds given geometrical structure using graphs created similarity information points. paper introduces technique uses graph structure convex optimization methods denoise 3d point clouds. short discussion presents methods naturally generalize time-varying inputs 3d point cloud time series.,4 "continuation semantics multi-quantifier sentences: operation-based approaches. classical scope-assignment strategies multi-quantifier sentences involve quantifier phrase (qp)-movement. recent continuation-based approaches provide compelling alternative, interpret qp's situ - without resorting logical forms structures beyond overt syntax. 
continuation-based strategies divided two groups: locate source scope-ambiguity rules semantic composition attribute lexical entries quantifier words. paper, focus former operation-based approaches nature semantic operations involved. specifically, discuss three possible operation-based strategies multi-quantifier sentences, together relative merits costs.",12 "short-term memory persistent activity: evolution self-stopping self-sustaining activity spiking neural networks. memories brain separated two categories: short-term long-term memories. long-term memories remain lifetime, short-term ones exist milliseconds minutes. within short-term memory studies, debate neural structure could implement it. indeed, mechanisms responsible long-term memories appear inadequate task. instead, proposed short-term memories could sustained persistent activity group neurons. work, explore topology could sustain short-term memories, designing model specific hypotheses, darwinian evolution order obtain new insights implementation. evolved 10 networks capable retaining information fixed duration 2 11s. main finding evolution naturally created two functional modules network: one sustains information containing primarily excitatory neurons, other, responsible forgetting, composed mainly inhibitory neurons. demonstrates balance inhibition excitation plays important role cognition.",4 algorithmic stability hypothesis complexity. introduce notion algorithmic stability learning algorithms---that term \emph{argument stability}---that captures stability hypothesis output learning algorithm normed space functions hypotheses selected. main result paper bounds generalization error learning algorithm terms argument stability. bounds based martingale inequalities banach space hypotheses belong. apply general bounds bound performance learning algorithms based empirical risk minimization stochastic gradient descent.,19 "parameterized approach personalized variable length summarization soccer matches. 
present parameterized approach produce personalized variable length summaries soccer matches. approach based temporally segmenting soccer video 'plays', associating user-specifiable 'utility' type play using 'bin-packing' select subset plays add desired length maximizing overall utility (volume bin-packing terms). approach systematically allows user override default weights assigned type play individual preferences thus see highly personalized variable length summarization soccer matches. demonstrate approach based output end-to-end pipeline building produce summaries. though aspects overall end-to-end pipeline human assisted present, results clearly show proposed approach capable producing semantically meaningful compelling summaries. besides obvious use producing summaries superior league matches news broadcasts, anticipate work promote greater awareness local matches junior leagues producing consumable summaries them.",4 "vegac: visual saliency-based age, gender, facial expression classification using convolutional neural networks. paper explores use visual saliency classify age, gender facial expression facial images. multi-task classification, propose method vegac, based visual saliency. using deep multi-level network [1] off-the-shelf face detector [2], proposed method first detects face test image extracts cnn predictions cropped face. cnn vegac fine-tuned collected dataset different benchmarks. convolutional neural network (cnn) uses vgg-16 architecture [3] pre-trained imagenet image classification. demonstrate usefulness method age estimation, gender classification, facial expression classification. show obtain competitive result method selected benchmarks. models code publicly available.",4 "modified splice extension non-stereo data noise robust speech recognition. paper, modification training process popular splice algorithm proposed noise robust speech recognition. 
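The play-selection step in the soccer-summarization abstract above, maximizing total utility under a length budget, has the shape of a 0/1 knapsack. A greedy utility-per-second sketch with hypothetical play types, durations, and utilities (the paper's exact bin-packing solver may differ):

```python
# Greedy sketch of selecting soccer 'plays' under a summary-length budget,
# maximizing total user-specified utility. Play types, durations and
# utilities below are hypothetical illustration values.

def select_plays(plays, budget_s):
    """plays: list of (name, duration_s, utility). Greedy by utility/second."""
    chosen, used, total = [], 0.0, 0.0
    for name, dur, util in sorted(plays, key=lambda p: p[2] / p[1], reverse=True):
        if used + dur <= budget_s:
            chosen.append(name)
            used += dur
            total += util
    return chosen, used, total

plays = [
    ("goal",     30, 10.0),
    ("penalty",  25,  8.0),
    ("save",     15,  4.0),
    ("corner",   20,  2.0),
    ("throw-in", 10,  0.5),
]
chosen, used, total = select_plays(plays, budget_s=60)
print(chosen, used, total)  # ['goal', 'penalty'] 55.0 18.0
```

Raising the utility weight of a play type (a user override, as the abstract describes) promotes those plays into the selected summary.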
modification based feature correlations, enables stereo-based algorithm improve performance noise conditions, especially unseen cases. further, modified framework extended work non-stereo datasets clean noisy training utterances, stereo counterparts, required. finally, mllr-based computationally efficient run-time noise adaptation method splice framework proposed. modified splice shows 8.6% absolute improvement splice test c aurora-2 database, 2.93% overall. non-stereo method shows 10.37% 6.93% absolute improvements aurora-2 aurora-4 baseline models respectively. run-time adaptation shows 9.89% absolute improvement modified framework compared splice test c, 4.96% overall w.r.t. standard mllr adaptation hmms.",4 "voice conversion unaligned corpora using variational autoencoding wasserstein generative adversarial networks. building voice conversion (vc) system non-parallel speech corpora challenging highly valuable real application scenarios. situations, source target speakers repeat texts may even speak different languages. case, one possible, although indirect, solution build generative model speech. generative models focus explaining observations latent variables instead learning pairwise transformation function, thereby bypassing requirement speech frame alignment. paper, propose non-parallel vc framework variational autoencoding wasserstein generative adversarial network (vaw-gan) explicitly considers vc objective building speech model. experimental results corroborate capability framework building vc system unaligned data, demonstrate improved conversion quality.",4 "optimal query complexity reconstructing hypergraphs. paper consider problem reconstructing hidden weighted hypergraph constant rank using additive queries. prove following: let $g$ weighted hidden hypergraph constant rank n vertices $m$ hyperedges. $m$ exists non-adaptive algorithm finds edges graph weights using $$ o(\frac{m\log n}{\log m}) $$ additive queries. solves open problem [s. choi, j. h. 
kim. optimal query complexity bounds finding graphs. {\em stoc}, 749--758,~2008]. weights hypergraph integers less $o(poly(n^d/m))$ $d$ rank hypergraph (and therefore unweighted hypergraphs) exists non-adaptive algorithm finds edges graph weights using $$ o(\frac{m\log \frac{n^d}{m}}{\log m}). $$ additive queries. using information theoretic bound query complexities tight.",4 "pyramidal gradient matching optical flow estimation. initializing optical flow field either sparse descriptor matching dense patch matches proved particularly useful capturing large displacements. paper, present pyramidal gradient matching approach provide dense matches highly accurate efficient optical flow estimation. novel contribution method image gradient used describe image patches proved able produce robust matching. therefore, method efficient methods adopt special features (like sift) patch distance metric. moreover, find image gradient scalable optical flow estimation, means use different levels gradient feature (for example, full gradients direction information gradients) obtain different complexity without dramatic changes accuracy. another contribution uncover secrets limited patchmatch thorough analysis design pyramidal matching framework based secrets. pyramidal matching framework aimed robust gradient matching effective grow inliers reject outliers. framework, present special enhancements outlier filtering gradient matching. initializing epicflow matches, experimental results show method efficient robust (ranking 1st clean pass final pass mpi sintel dataset among published methods).",4 "multi-label pixelwise classification reconstruction large-scale urban areas. object classification one many holy grails computer vision resulted large number algorithms proposed already. specifically recent years considerable progress area primarily due increased efficiency accessibility deep learning techniques. fact, single-label object classification [i.e. 
one object present image] state-of-the-art techniques employ deep neural networks reporting close human-like performance. specialized applications single-label object-level classification suffice; example cases image contains multiple intertwined objects different labels. paper, address complex problem multi-label pixelwise classification. present distinct solution based convolutional neural network (cnn) performing multi-label pixelwise classification application large-scale urban reconstruction. supervised learning approach followed training 13-layer cnn using lidar satellite images. empirical study conducted determine hyperparameters result optimal performance cnn. scale invariance introduced training network five different scales input labeled data. results six pixelwise classifications different scale. svm trained map six pixelwise classifications single-label. lastly, refine boundary pixel labels using graph-cuts maximum a-posteriori (map) estimation markov random field (mrf) priors. resulting pixelwise classification used accurately extract reconstruct buildings large-scale urban areas. proposed approach extensively tested results reported.",4 "probabilistic linear genetic programming stochastic context-free grammar solving symbolic regression problems. traditional linear genetic programming (lgp) algorithms based selection mechanism guide search. genetic operators combine mutate random portions individuals, without knowing result lead fitter individual. probabilistic model building genetic programming (pmb-gp) methods proposed overcome issue probability model captures structure fit individuals use sample new individuals. work proposes use lgp stochastic context-free grammar (scfg), probability distribution updated according selected individuals. proposed method adapting grammar linear representation lgp. 
tests performed proposed probabilistic method, two hybrid approaches, several symbolic regression benchmark problems show results statistically better obtained traditional lgp.",4 "deep convolutional neural network using directional wavelets low-dose x-ray ct reconstruction. due potential risk inducing cancers, radiation dose x-ray ct reduced routine patient scanning. however, low-dose x-ray ct, severe artifacts usually occur due photon starvation, beamhardening, etc, decrease reliability diagnosis. thus, high quality reconstruction low-dose x-ray ct data become one important research topics ct community. conventional model-based denoising approaches are, however, computationally expensive, image domain denoising approaches hardly deal ct specific noise patterns. address issues, propose algorithm using deep convolutional neural network (cnn), applied wavelet transform coefficients low-dose ct images. specifically, using directional wavelet transform extracting directional component artifacts exploiting intra- inter-band correlations, deep network effectively suppress ct specific noises. moreover, cnn designed various types residual learning architecture faster network training better denoising. experimental results confirm proposed algorithm effectively removes complex noise patterns ct images, originated reduced x-ray dose. addition, show wavelet domain cnn efficient removing noises low-dose ct compared image domain cnn. results rigorously evaluated several radiologists second place award 2016 aapm low-dose ct grand challenge. best knowledge, work first deep learning architecture low-dose ct reconstruction rigorously evaluated proven efficacy.",4 "probabilistic dimensionality reduction via structure learning. propose novel probabilistic dimensionality reduction framework naturally integrate generative model locality information data. based framework, present new model, able learn smooth skeleton embedding points low-dimensional space high-dimensional noisy data. 
formulation new model equivalently interpreted two coupled learning problem, i.e., structure learning learning projection matrix. interpretation motivates learning embedding points directly form explicit graph structure. develop new method learn embedding points form spanning tree, extended obtain discriminative compact feature representation clustering problems. unlike traditional clustering methods, assume centers clusters close connected learned graph, cluster centers distant. greatly facilitate data visualization scientific discovery downstream analysis. extensive experiments performed demonstrate proposed framework able obtain discriminative feature representations, correctly recover intrinsic structures various real-world datasets.",19 "large-scale music annotation retrieval: learning rank joint semantic spaces. music prediction tasks range predicting tags given song clip audio, predicting name artist, predicting related songs given song, clip, artist name tag. is, interested every semantic relationship different musical concepts database. realistically sized databases, number songs measured hundreds thousands more, number artists tens thousands more, providing considerable challenge standard machine learning techniques. work, propose method scales datasets attempts capture semantic similarities database items modeling audio, artist names, tags single low-dimensional semantic space. choice space learnt optimizing set prediction tasks interest jointly using multi-task learning. method outperforms baseline methods and, comparison them, faster consumes less memory. demonstrate method learns interpretable model, semantic space captures well similarities interest.",4 "note alternating minimization algorithm matrix completion problem. consider problem reconstructing low rank matrix subset entries analyze two variants so-called alternating minimization algorithm, proposed past. 
establish underlying matrix rank $r=1$, positive bounded entries, graph $\mathcal{g}$ underlying revealed entries bounded degree diameter logarithmic size matrix, algorithms succeed reconstructing matrix approximately polynomial time starting arbitrary initialization. provide simulation results suggest second algorithm based message passing type updates, performs significantly better.",19 "deep learning isotropic super-resolution non-isotropic 3d electron microscopy. sophisticated existing methods generate 3d isotropic super-resolution (sr) non-isotropic electron microscopy (em) based learned dictionaries. unfortunately, none existing methods generate practically satisfying results. 2d natural images, recently developed super-resolution methods use deep learning shown significantly outperform previous state art. adapted one successful architectures (fsrcnn) 3d super-resolution, compared performance 3d u-net architecture used previously generate super-resolution. trained architectures artificially downscaled isotropic ground truth focused ion beam milling scanning em (fib-sem) tested performance various hyperparameter settings. results indicate architectures successfully generate 3d isotropic super-resolution non-isotropic em, u-net performing consistently better. propose several promising directions practical application.",4 "intrusions marked renewal processes. present probabilistic model intrusion marked renewal process. given process sequence events, intrusion subsequence events produced process. applications model are, example, online payment fraud fraudster taking user's account performing payments user's behalf, unexpected equipment failures due unintended use. adopt bayesian approach infer probability intrusion sequence events, map subsequence events constituting intrusion, marginal probability event sequence belong intrusion. 
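The alternating minimization analyzed in the matrix-completion note above can be sketched concretely for the rank-1 case: given revealed entries of m = u vᵀ with positive entries, alternately solve least squares for u and v. A minimal sketch on synthetic data with a deterministic (checkerboard) revelation pattern, starting from an arbitrary initialization:

```python
import random

# Rank-1 alternating minimization sketch for matrix completion:
# observe a subset of entries of M = u v^T (positive entries), then
# alternate exact least-squares updates for u and v. Synthetic data only.

random.seed(0)
n = 8
u_true = [1.0 + random.random() for _ in range(n)]
v_true = [1.0 + random.random() for _ in range(n)]

# Reveal a checkerboard subset of entries (half of the matrix).
omega = [(i, j) for i in range(n) for j in range(n) if (i + j) % 2 == 0]
M = {(i, j): u_true[i] * v_true[j] for (i, j) in omega}
rows = {i: [j for (ii, j) in omega if ii == i] for i in range(n)}
cols = {j: [i for (i, jj) in omega if jj == j] for j in range(n)}

u = [1.0] * n   # arbitrary initialization
v = [1.0] * n
for _ in range(20):
    for i in range(n):   # least-squares update of u given v
        u[i] = sum(M[i, j] * v[j] for j in rows[i]) / sum(v[j] ** 2 for j in rows[i])
    for j in range(n):   # least-squares update of v given u
        v[j] = sum(M[i, j] * u[i] for i in cols[j]) / sum(u[i] ** 2 for i in cols[j])

err = max(abs(u[i] * v[j] - M[i, j]) for (i, j) in omega)
print(err)  # near machine precision on the revealed entries
```

The factors u and v are only identifiable up to a scale (u, v) → (c·u, v/c), but the product u vᵀ on the revealed entries is recovered, which is the quantity the note's guarantees concern.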
evaluate model intrusion detection synthetic data, well anonymized data online payment system.",4 "evidence size principle semantic perceptual domains. shepard's universal law generalization offered compelling case first physics-like law cognitive science hold intelligent agents universe. shepard's account based rational bayesian model generalization, providing answer question law emerge. extending account explain humans use multiple examples make better generalizations requires additional assumption, called size principle: hypotheses pick fewer objects make larger contribution generalization. degree principle warrants similarly law-like status far conclusive. typically, evaluating principle straightforward, requiring additional assumptions. present new method evaluating size principle direct, apply method diverse array datasets. results provide support broad applicability size principle.",4 "aerial spectral super-resolution using conditional adversarial networks. inferring spectral signatures ground based natural images acquired lot interest applied deep learning. contrast spectra ground based images, aerial spectral images low spatial resolution suffer higher noise interference. paper, train conditional adversarial network learn inverse mapping trichromatic space 31 spectral bands within 400 700 nm. network trained aerocampus, first kind aerial hyperspectral dataset. aerocampus consists high spatial resolution color images low spatial resolution hyperspectral images (hsi). color images synthesized 31 spectral bands used train network. baseline root mean square error 2.48 synthesized rgb test data, show possible generate spectral signatures aerial imagery.",4 "gpu-based image analysis mobile devices. rapid advances mobile technology many mobile devices capable capturing high quality images video embedded camera. paper investigates techniques real-time processing resulting images, particularly on-device utilizing graphical processing unit. 
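The size principle from the generalization abstract above has a direct Bayesian form: under strong sampling, n examples consistent with hypothesis h contribute likelihood (1/|h|)ⁿ, so smaller hypotheses dominate as examples accumulate. A sketch with illustrative hypothesis sizes:

```python
# Size principle sketch: n consistent examples give hypothesis h of size |h|
# likelihood (1/|h|)^n, so smaller hypotheses gain posterior weight with n.
# Hypothesis sizes below are hypothetical illustration values.

def posterior(sizes, n, prior=None):
    """Posterior over hypotheses, all assumed consistent with the n examples."""
    prior = prior or [1.0 / len(sizes)] * len(sizes)
    weights = [p * (1.0 / s) ** n for p, s in zip(prior, sizes)]
    z = sum(weights)
    return [w / z for w in weights]

sizes = [2, 10]               # a small and a large consistent hypothesis
print(posterior(sizes, 1))    # modest preference after one example
print(posterior(sizes, 4))    # near-certainty after four examples
```

This is the "multiple examples sharpen generalization" effect the abstract sets out to test empirically.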
issues limitations image processing mobile devices discussed, performance graphical processing units range devices measured programmable shader implementation canny edge detection.",4 "binary excess risk smooth convex surrogates. statistical learning theory, convex surrogates 0-1 loss highly preferred computational theoretical virtues convexity brings in. importance consider smooth surrogates witnessed fact smoothness beneficial computationally- attaining {\it optimal} convergence rate optimization, statistical sense- providing improved {\it optimistic} rate generalization bound. paper investigate smoothness property viewpoint statistical consistency show affects binary excess risk. show contrast optimization generalization errors favor choice smooth surrogate loss, smoothness loss function may degrade binary excess risk. motivated negative result, provide unified analysis integrates optimization error, generalization bound, error translating convex excess risk binary excess risk examining impact smoothness binary excess risk. show favorable conditions appropriate choice smooth convex loss result binary excess risk better $o(1/\sqrt{n})$.",4 "sparse image representation epitomes. sparse coding, decomposition vector using basis elements, widely used machine learning image processing. basis set, also called dictionary, learned adapt specific data. approach proven effective many image processing tasks. traditionally, dictionary unstructured ""flat"" set atoms. paper, study structured dictionaries obtained epitome, set epitomes. epitome small image, atoms patches chosen size inside image. considerably reduces number parameters learn provides sparse image decompositions shiftinvariance properties. propose new formulation algorithm learning structured dictionaries associated epitomes, illustrate use image denoising tasks.",4 "autoperf: generalized zero-positive learning system detect software performance anomalies. 
paper, present autoperf, generalized software performance anomaly detection system. autoperf uses autoencoders, unsupervised learning technique, hardware performance counters learn performance signatures parallel programs. uses knowledge identify newer versions program suffer performance penalties, simultaneously providing root cause analysis help programmers debug program's performance. autoperf first zero-positive learning performance anomaly detector, system trains entirely negative (non-anomalous) space learn positive (anomalous) behaviors. demonstrate autoperf's generality three different types performance anomalies: (i) true sharing cache contention, (ii) false sharing cache contention, (iii) numa latencies across 15 real world performance anomalies 7 open source programs. autoperf 3.7% profiling overhead (on average) detects anomalies prior state-of-the-art approach.",4 "supervised ibp: neighbourhood preserving infinite latent feature models. propose probabilistic model infer supervised latent variables hamming space observed data. model allows simultaneous inference number binary latent variables, values. latent variables preserve neighbourhood structure data sense objects semantic concept similar latent values, objects different concepts dissimilar latent values. formulate supervised infinite latent variable problem based intuitive principle pulling objects together type, pushing apart not. combine principle flexible indian buffet process prior latent variables. show inferred supervised latent variables directly used perform nearest neighbour search purpose retrieval. introduce new application dynamically extending hash codes, show effectively couple structure hash codes continuously growing structure neighbourhood preserving infinite latent feature space.",4 "non-distributional word vector representations. data-driven representation learning words technique central importance nlp. 
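The zero-positive idea in the AutoPerf abstract above, training only on non-anomalous runs and flagging departures from them, can be sketched with a stand-in scoring rule. AutoPerf itself uses autoencoders over hardware performance counters; the z-score model and the counter values below are purely illustrative:

```python
# Zero-positive sketch: fit a model on negative (non-anomalous) samples only,
# derive a score threshold from them, and flag anything scoring above it.
# Stand-in z-score model with synthetic counter values, for illustration.

def fit(normal_runs):
    n = len(normal_runs)
    mean = sum(normal_runs) / n
    std = (sum((x - mean) ** 2 for x in normal_runs) / n) ** 0.5
    # Threshold = worst score seen on the (all-negative) training set.
    threshold = max(abs(x - mean) / std for x in normal_runs)
    return mean, std, threshold

def is_anomaly(x, model):
    mean, std, threshold = model
    return abs(x - mean) / std > threshold

normal = [100, 102, 98, 101, 99, 103, 97, 100]   # e.g., cache misses per op
model = fit(normal)
print(is_anomaly(150, model))   # True: far outside the trained envelope
print(is_anomaly(101, model))   # False: consistent with normal behavior
```

The key property is that no anomalous example is ever needed during training, which is what "zero-positive" means in the abstract.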
indisputably useful source features downstream tasks, vectors tend consist uninterpretable components whose relationship categories traditional lexical semantic theories tenuous best. present method constructing interpretable word vectors hand-crafted linguistic resources like wordnet, framenet etc. vectors binary (i.e, contain 0 1) 99.9% sparse. analyze performance state-of-the-art evaluation methods distributional models word vectors find competitive standard distributional approaches.",4 "syntactic structures code parameters. assign binary ternary error-correcting codes data syntactic structures world languages study distribution code points space code parameters. show that, codes populate lower region approximating superposition thomae functions, substantial presence codes gilbert-varshamov bound even asymptotic bound plotkin bound. investigate dynamics induced space code parameters spin glass models language change, show that, presence entailment relations syntactic parameters dynamics sometimes improve code. large sets languages syntactic data, one gain information spin glass dynamics induced dynamics space code parameters.",4 "comparison three methods clustering: k-means, spectral clustering hierarchical clustering. comparison three kind clustering find cost function loss function calculate them. error rate clustering methods calculate error percentage always one important factor evaluating clustering methods, paper introduce one way calculate error rate clustering methods. clustering algorithms divided several categories including partitioning clustering algorithms, hierarchical algorithms density based algorithms. 
generally speaking compare clustering algorithms scalability, ability work different attribute, clusters formed conventional, minimal knowledge computer recognize input parameters, classes dealing noise extra deposition error rate clustering new data, thus, effect input data, different dimensions high levels, k-means one simplest approach clustering clustering unsupervised problem.",4 "understanding deep neural networks rectified linear units. paper investigate family functions representable deep neural networks (dnn) rectified linear units (relu). give algorithm train relu dnn one hidden layer *global optimality* runtime polynomial data size albeit exponential input dimension. further, improve known lower bounds size (from exponential super exponential) approximating relu deep net function shallower relu net. gap theorems hold smoothly parametrized families ""hard"" functions, contrary countable, discrete families known literature. example consequence gap theorems following: every natural number $k$ exists function representable relu dnn $k^2$ hidden layers total size $k^3$, relu dnn $k$ hidden layers require least $\frac{1}{2}k^{k+1}-1$ total nodes. finally, family $\mathbb{r}^n\to \mathbb{r}$ dnns relu activations, show new lowerbound number affine pieces, larger previous constructions certain regimes network architecture distinctively lowerbound demonstrated explicit construction *smoothly parameterized* family functions attaining scaling. construction utilizes theory zonotopes polyhedral theory.",4 "batch, off-policy, actor-critic algorithm optimizing average reward. develop off-policy actor-critic algorithm learning optimal policy training set composed data multiple individuals. algorithm developed view towards use mobile health.",19 "multi-spectral image panchromatic sharpening, outcome process quality assessment protocol. multispectral (ms) image panchromatic (pan) sharpening algorithms proposed remote sensing community ever increasing number variety. 
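As a concrete reference for the partitioning family in the clustering-comparison abstract above, a minimal k-means (Lloyd's algorithm) on 1-D toy data:

```python
# Minimal k-means (Lloyd's algorithm) on 1-D toy data, illustrating the
# partitioning family of clustering algorithms.

def kmeans(points, centers, iters=20):
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: (p - centers[c]) ** 2)
                  for p in points]
        # Update step: each center moves to the mean of its cluster.
        for c in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, labels

points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]
centers, labels = kmeans(points, centers=[0.0, 10.0])
print(centers)   # ≈ [1.0, 8.0]
print(labels)    # [0, 0, 0, 1, 1, 1]
```

With ground-truth labels available, the error rate the abstract discusses is simply the fraction of points whose assigned cluster disagrees with the best label-to-cluster matching.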
aim sharpen coarse spatial resolution ms image fine spatial resolution pan image acquired simultaneously spaceborne airborne earth observation (eo) optical imaging sensor pair. unfortunately, date, standard evaluation procedure ms image pan sharpening outcome process community agreed upon, contrast quality assurance framework earth observation (qa4eo) guidelines proposed intergovernmental group earth observations (geo). general, process easier measure, outcome important. original contribution present study fourfold. first, existing procedures quantitative quality assessment (q2a) (sole) pan sharpened ms product critically reviewed. conceptual implementation drawbacks highlighted overcome quality improvement. second, novel (to best authors' knowledge, first) protocol q2a ms image pan sharpening product process designed, implemented validated independent means. third, within protocol, innovative categorization spectral spatial image quality indicators metrics presented. fourth, according new taxonomy, original third order isotropic multi scale gray level co occurrence matrix (tims glcm) calculator tims glcm texture feature extractor proposed replace popular second order glcms.",4 "review evaluation techniques social dialogue systems. contrast goal-oriented dialogue, social dialogue clear measure task success. consequently, evaluation systems notoriously hard. paper, review current evaluation methods, focusing automatic metrics. conclude turn-based metrics often ignore context account fact several replies valid, end-of-dialogue rewards mainly hand-crafted. lack grounding human perceptions.",4 "segan: speech enhancement generative adversarial network. current speech enhancement techniques operate spectral domain and/or exploit higher-level feature. majority tackle limited number noise conditions rely first-order statistics. circumvent issues, deep networks increasingly used, thanks ability learn complex functions large example sets. 
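The spatial quality metrics in the pan-sharpening abstract above build on gray-level co-occurrence matrices. A minimal second-order GLCM for a one-pixel horizontal offset; the paper's third-order isotropic multi-scale (TIMS) variant extends this basic construction:

```python
# Minimal second-order gray-level co-occurrence matrix (GLCM) for the
# offset (0, 1): count how often gray level j appears immediately to the
# right of gray level i. Illustrative 3x3 image with 3 gray levels.

def glcm(image, levels):
    m = [[0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):   # horizontally adjacent pairs
            m[a][b] += 1
    return m

img = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 1],
]
print(glcm(img, levels=3))  # [[1, 2, 0], [0, 1, 1], [0, 0, 1]]
```

Texture features (contrast, homogeneity, etc.) are then scalar statistics of this matrix, computed per offset and, in the multi-scale case, per resolution.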
work, propose use generative adversarial networks speech enhancement. contrast current techniques, operate waveform level, training model end-to-end, incorporate 28 speakers 40 different noise conditions model, model parameters shared across them. evaluate proposed model using independent, unseen test set two speakers 20 alternative noise conditions. enhanced samples confirm viability proposed model, objective subjective evaluations confirm effectiveness it. that, open exploration generative architectures speech enhancement, may progressively incorporate speech-centric design choices improve performance.",4 "screen content image segmentation using sparse-smooth decomposition. sparse decomposition extensively used different applications including signal compression denoising document analysis. paper, sparse decomposition used image segmentation. proposed algorithm separates background foreground using sparse-smooth decomposition technique smooth sparse components correspond background foreground respectively. algorithm tested several test images hevc test sequences shown superior performance methods, hierarchical k-means clustering djvu. segmentation algorithm also used text extraction, video compression medical image segmentation.",4 "causal models complete axiomatic characterization. markov networks bayesian networks effective graphic representations dependencies embedded probabilistic models. well known independencies captured markov networks (called graph-isomorphs) finite axiomatic characterization. paper, however, shows independencies captured bayesian networks (called causal models) axiomatization using even countably many horn disjunctive clauses. sub-independency model causal model may causal, graph-isomorphs closed sub-models.",4 "meta-learning phonemic annotation corpora. apply rule induction, classifier combination meta-learning (stacked classifiers) problem bootstrapping high accuracy automatic annotation corpora pronunciation information. 
The task we address in this paper consists of generating phonemic representations reflecting the Flemish and Dutch pronunciations of a word on the basis of its orthographic representation (which in turn is based on the actual speech recordings). We compare several possible approaches to achieve this text-to-pronunciation mapping task: memory-based learning, transformation-based learning, rule induction, maximum entropy modeling, combination of classifiers in stacked learning, and stacking of meta-learners. We are interested in both optimal accuracy and obtaining insight into the linguistic regularities involved. As far as accuracy is concerned, an already high accuracy level (93% for CELEX and 86% for Fonilex at word level) for single classifiers is boosted significantly, with additional error reductions of 31% and 38% respectively using combination of classifiers, and a further 5% using combination of meta-learners, bringing the overall word level accuracy to 96% for the Dutch variant and 92% for the Flemish variant. We also show that the application of machine learning methods indeed leads to increased insight into the linguistic regularities determining the variation between the two pronunciation variants studied.",4 "Neuron pruning for compressing deep networks using maxout architectures. This paper presents an efficient and robust approach for reducing the size of deep neural networks by pruning entire neurons. It exploits maxout units for combining neurons into more complex convex functions, and it makes use of a local relevance measurement that ranks neurons according to their activation on the training set for pruning them. Additionally, a parameter reduction in comparison to neuron weight pruning is shown. It is empirically shown that the proposed neuron pruning reduces the number of parameters dramatically. The evaluation is performed on two tasks, MNIST handwritten digit recognition and LFW face verification, using LeNet-5 and VGG16 network architectures. The network size is reduced by $74\%$ and $61\%$, respectively, without affecting the network's performance. The main advantage of neuron pruning is its direct influence on the size of the network architecture. Furthermore, it is shown that neuron pruning can be combined with subsequent weight pruning, reducing the size of LeNet-5 and VGG16 by $92\%$ and $80\%$ respectively.",4 "Do all fragments count?. 
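The activation-based relevance ranking in the neuron-pruning entry above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a toy two-layer network with ReLU units instead of maxout, and uses mean absolute activation over a sample set as the relevance score; all weights and sizes are made-up examples.

```python
def prune_neurons(W1, W2, X, keep):
    # W1: input->hidden weights (one weight vector per hidden neuron)
    # W2: hidden->output weights (one row per hidden neuron)
    # relevance of a hidden neuron = mean ReLU activation over the set X
    def act(x, w):
        return max(0.0, sum(xi * wi for xi, wi in zip(x, w)))
    relevance = [sum(act(x, w) for x in X) / len(X) for w in W1]
    ranked = sorted(range(len(W1)), key=lambda i: relevance[i], reverse=True)
    kept = sorted(ranked[:keep])          # keep the most active neurons
    return [W1[i] for i in kept], [W2[i] for i in kept]

# toy network: 2 inputs -> 3 hidden -> 1 output; prune to 2 hidden neurons
W1 = [[1.0, 0.0], [0.0, 0.01], [0.5, 0.5]]
W2 = [[1.0], [1.0], [1.0]]
X = [[1.0, 1.0], [2.0, 0.0]]
W1p, W2p = prune_neurons(W1, W2, X, keep=2)
print(len(W1p))  # 2
```

Pruning whole neurons this way shrinks both weight matrices at once, which is the "direct influence on the architecture" the entry refers to.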
We aim at finding the minimal set of fragments which achieves maximal parse accuracy in Data Oriented Parsing. Experiments with the Penn Wall Street Journal treebank show that counts of almost arbitrary fragments within parse trees are important, leading to improved parse accuracy over previous models tested on this treebank. We isolate a number of dependency relations which previous models neglect but which contribute to higher parse accuracy.",4 "Vision recognition using discriminant sparse optimization learning. To better select the correct training samples and obtain a robust representation of the query sample, this paper proposes a discriminant-based sparse optimization learning model. This learning model integrates discriminant and sparsity together. Based on this model, we propose a classifier called locality-based discriminant sparse representation (LDSR). Because the discriminant can help to increase the difference between samples of different classes and to decrease the difference between samples within a class, LDSR can obtain better sparse coefficients and constitute a better sparse representation for classification. In order to take advantage of kernel techniques, discriminant and sparsity, we further propose a nonlinear classifier called kernel locality-based discriminant sparse representation (KLDSR). Experiments on several well-known databases prove that the performance of LDSR and KLDSR is better than that of several state-of-the-art methods, including deep learning based methods.",4 "DeepHash: getting regularization, depth and fine-tuning right. This work focuses on representing very high-dimensional global image descriptors using compact 64-1024 bit binary hashes for instance retrieval. We propose DeepHash: a hashing scheme based on deep networks. Key to making DeepHash work at extremely low bitrates are three important considerations -- regularization, depth and fine-tuning -- each requiring solutions specific to the hashing problem. An in-depth evaluation shows that our scheme consistently outperforms state-of-the-art methods across all data sets for both Fisher vectors and deep convolutional neural network features, by up to 20 percent over other schemes. 
The retrieval performance with 256-bit hashes is close to that of the uncompressed floating point features -- a remarkable 512 times compression.",4 "A short review of ethical challenges in clinical natural language processing. Clinical NLP has an immense potential in contributing to how clinical practice will be revolutionized by the advent of large scale processing of clinical records. However, this potential has remained largely untapped due to slow progress primarily caused by strict data access policies for researchers. In this paper, we discuss the concern for privacy and the measures it entails. We also suggest sources of less sensitive data. Finally, we draw attention to biases that can compromise the validity of empirical research and lead to socially harmful applications.",4 "Encoder-decoder shift-reduce syntactic parsing. Starting from NMT, encoder-decoder neural networks have been used for many NLP problems. Graph-based models and transition-based models borrowing encoder components achieve state-of-the-art performance on dependency parsing and constituent parsing, respectively. However, there has been little work empirically studying encoder-decoder neural networks for transition-based parsing. We apply a simple encoder-decoder to this end, achieving comparable results to the parser of Dyer et al. (2015) on standard dependency parsing, and outperforming the parser of Vinyals et al. (2015) on constituent parsing.",4 "Diffusion convolutional recurrent neural network: data-driven traffic forecasting. Spatiotemporal forecasting has various applications in neuroscience, climate and the transportation domain. Traffic forecasting is one canonical example of such a learning task. The task is challenging due to (1) complex spatial dependency on road networks, (2) non-linear temporal dynamics with changing road conditions and (3) the inherent difficulty of long-term forecasting. To address these challenges, we propose to model the traffic flow as a diffusion process on a directed graph and introduce the Diffusion Convolutional Recurrent Neural Network (DCRNN), a deep learning framework for traffic forecasting that incorporates both spatial and temporal dependency in the traffic flow. 
Specifically, DCRNN captures the spatial dependency using bidirectional random walks on the graph, and the temporal dependency using the encoder-decoder architecture with scheduled sampling. We evaluate the framework on two real-world large scale road network traffic datasets and observe consistent improvement of 12% - 15% over state-of-the-art baselines.",4 "ChatPainter: improving text to image generation using dialogue. Synthesizing realistic images from text descriptions on a dataset like Microsoft Common Objects in Context (MS COCO), where each image can contain several objects, is a challenging task. Prior work has used text captions to generate images. However, captions might not be informative enough to capture the entire image, and are insufficient for the model to be able to understand which objects in the images correspond to which words in the captions. We show that adding a dialogue that further describes the scene leads to significant improvement in the inception score and in the quality of generated images on the MS COCO dataset.",4 "Retrieval and registration of long-range overlapping frames for scalable mosaicking of in vivo fetoscopy. Purpose: The standard clinical treatment of twin-to-twin transfusion syndrome consists in the photo-coagulation of undesired anastomoses located on the placenta which are responsible for a blood transfer between the two twins. While being the standard of care procedure, fetoscopy suffers from a limited field-of-view of the placenta, resulting in missed anastomoses. To facilitate the task of the clinician, building a global map of the placenta providing a larger overview of the vascular network is highly desired. Methods: To overcome the challenging visual conditions inherent to in vivo sequences (low contrast, obstructions or presence of artifacts, among others), we propose the following contributions: (i) robust pairwise registration is achieved by aligning the orientation of the image gradients, and (ii) difficulties regarding long-range consistency (e.g. due to the presence of outliers) are tackled via a bag-of-words strategy, which identifies overlapping frames of the sequence to be registered regardless of their respective location in time. Results: In addition to visual difficulties, in vivo sequences are characterised by the intrinsic absence of a gold standard. We present mosaics motivating our methodological choices qualitatively and demonstrating their promising aspect. 
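The diffusion convolution at the heart of the DCRNN entry above can be illustrated, in one direction only, as a weighted sum of powers of the random-walk transition matrix applied to a graph signal. This is a toy sketch under assumptions of my own (a 3-node chain graph with self-loops, scalar node signal, two filter coefficients), not the paper's bidirectional, learned formulation.

```python
def rw_transition(W):
    # row-normalize the weighted adjacency matrix: P = D^{-1} W
    return [[w / sum(row) for w in row] for row in W]

def matvec(P, x):
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

def diffusion_conv(W, x, theta):
    # y = sum_k theta[k] * P^k x  (single-direction diffusion convolution)
    P = rw_transition(W)
    out = [0.0] * len(x)
    xk = x[:]                          # P^0 x
    for t in theta:
        out = [o + t * v for o, v in zip(out, xk)]
        xk = matvec(P, xk)             # advance one diffusion step
    return out

# 3-node chain graph with self-loops; unit signal on the first node
W = [[1.0, 1.0, 0.0],
     [1.0, 1.0, 1.0],
     [0.0, 1.0, 1.0]]
x = [1.0, 0.0, 0.0]
print(diffusion_conv(W, x, theta=[0.5, 0.5]))
```

Each extra coefficient in `theta` lets information diffuse one hop further along the road graph, which is how the model trades receptive field against cost.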
We also demonstrate semi-quantitatively, via visual inspection of the registration results, the efficacy of our registration approach in comparison to two standard baselines. Conclusion: This paper proposes the first approach for the construction of mosaics of the placenta in in vivo fetoscopy sequences. Robustness to visual challenges during registration and long-range temporal consistency are proposed, offering first positive results on in vivo data for which standard mosaicking techniques are not applicable.",4 "Boosting trees for anti-spam email filtering. This paper describes a set of comparative experiments on the problem of automatically filtering unwanted electronic mail messages. Several variants of the AdaBoost algorithm with confidence-rated predictions [Schapire & Singer, 99] have been applied, which differ in the complexity of the base learners considered. Two main conclusions can be drawn from our experiments: a) the boosting-based methods clearly outperform the baseline learning algorithms (Naive Bayes and induction of decision trees) on the PU1 corpus, achieving very high levels of the F1 measure; b) increasing the complexity of the base learners allows one to obtain better ``high-precision'' classifiers, which is a very important issue when misclassification costs are considered.",4 "Relations on FP-soft sets applied to decision making problems. In this work, we first define relations on fuzzy parametrized soft sets and study their properties. We also give a decision making method based on these relations. In approximate reasoning, relations on fuzzy parametrized soft sets are shown to be of primordial importance. Finally, the method is successfully applied to problems that contain uncertainties.",12 "ARCO1: an application of belief networks to the oil market. Belief networks are a new, potentially important, class of knowledge-based models. ARCO1, currently under development at the Atlantic Richfield Company (ARCO) and the University of Southern California (USC), is the most advanced reported implementation of these models in a financial forecasting setting. ARCO1's underlying belief network models the variables believed to impact the crude oil market. A pictorial market model (developed on a Mac II) facilitates consensus among the members of the forecasting team. The system forecasts crude oil prices via Monte Carlo analyses of the network. 
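The boosting setup in the anti-spam entry above (AdaBoost over simple base learners) can be sketched with the least complex base learner, a decision stump. The 1-D toy data, threshold search and round count below are illustrative assumptions, not the PU1 experiment, and this is the discrete AdaBoost re-weighting scheme rather than the confidence-rated variant.

```python
import math

def stump_predict(x, thr, sign):
    return sign if x <= thr else -sign

def best_stump(xs, ys, w):
    # exhaustively pick the threshold stump with minimal weighted error
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if stump_predict(xi, thr, sign) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = best_stump(xs, ys, w)
        err = max(err, 1e-12)                      # guard divide-by-zero
        alpha = 0.5 * math.log((1 - err) / err)    # stump weight
        ensemble.append((alpha, thr, sign))
        # re-weight: boost the examples this stump got wrong
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, thr, sign))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

# toy 1-D data that no single stump can separate
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=5)
print([predict(model, x) for x in xs])
```

After a few rounds the weighted vote of weak stumps fits the interval pattern that defeats any individual stump, which is the effect the comparison in the entry builds on.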
Several different models of the oil market have been developed; the system's ability to be updated quickly highlights its flexibility.",4 "Intraoperative margin assessment of human breast tissue in optical coherence tomography images using deep neural networks. Objective: In this work, we perform margin assessment of human breast tissue from optical coherence tomography (OCT) images using deep neural networks (DNNs). This work simulates the intraoperative setting of breast cancer lumpectomy. Methods: To train the DNNs, we use both state-of-the-art methods (weight decay and dropout) and a newly introduced regularization method based on function norms. The commonly used methods can fail when only a small database is available. The use of a function norm introduces a direct control over the complexity of the function, with the aim of diminishing the risk of overfitting. Results: As neither the code nor the data of previous results are publicly available, the obtained results are compared with the reported results in the literature in a conservative comparison. Moreover, our method is applied to locally collected data in several data configurations. The reported results are averages over the different trials. Conclusion: The experimental results show that the use of DNNs yields significantly better results than the other techniques evaluated, in terms of sensitivity, specificity, F1 score, G-mean and Matthews correlation coefficient. Function norm regularization yielded higher and more robust results than the competing methods. Significance: We demonstrate a system that shows high promise for (partially) automated margin assessment of human breast tissue, with the equal error rate (EER) reduced from approximately 12\% (the lowest reported in the literature) to 5\%\,--\,a 58\% reduction. The method is computationally feasible for intraoperative application (less than 2 seconds per image).",19 "A worst-case upper bound for (1, 2)-QSAT. We give a rigorous theoretical analysis of an algorithm for a subclass of QSAT, i.e. (1, 2)-QSAT, proposed in the literature. (1, 2)-QSAT, first introduced at SAT'08, can be seen as quantified extended 2-CNF formulas. Until now, within our knowledge, there exists no algorithm presenting the worst-case upper bound for (1, 2)-QSAT. Therefore, in this paper, we present an exact algorithm to solve (1, 2)-QSAT. 
By analyzing the algorithms, we obtain a worst-case upper bound of O(1.4142^m), where m is the number of clauses.",4 "Context driven label fusion for segmentation of subcutaneous and visceral fat in CT volumes. Quantification of adipose tissue (fat) from computed tomography (CT) scans is conducted mostly through manual or semi-automated image segmentation algorithms with limited efficacy. In this work, we propose a completely unsupervised and automatic method to identify adipose tissue, and then separate subcutaneous adipose tissue (SAT) from visceral adipose tissue (VAT) in the abdominal region. We offer a three-phase pipeline consisting of (1) initial boundary estimation using gradient points, (2) boundary refinement using the geometric median absolute deviation and appearance based local outlier scores, and (3) context driven label fusion using conditional random fields (CRF) to obtain the final boundary between SAT and VAT. We evaluate the proposed method on 151 abdominal CT scans and obtain state-of-the-art 94% and 91% Dice similarity scores for SAT and VAT segmentation, as well as a significant reduction in the fat quantification error measure.",4 "An integral curvature representation and matching algorithms for identification of dolphins and whales. We address the problem of identifying individual cetaceans from images showing the trailing edges of their fins. Given a trailing edge from an unknown individual, we produce a ranking of known individuals from a database. The nicks and notches along the trailing edge define an individual's unique signature. We define a representation based on integral curvature that is robust to changes in viewpoint and pose, and captures the pattern of nicks and notches in a local neighborhood at multiple scales. We explore two ranking methods that use this representation. The first uses a dynamic programming time-warping algorithm to align two representations, and interprets the alignment cost as a measure of similarity. This algorithm also exploits learned spatial weights to downweight matches from regions of unstable curvature. The second interprets the representation as a feature descriptor. Feature keypoints are defined at the local extrema of the representation. Descriptors for the set of known individuals are stored in a tree structure, which allows us to perform queries given the descriptors from an unknown trailing edge. 
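The dynamic-programming time-warping alignment in the curvature-matching entry above can be sketched as the classic DTW recurrence over two 1-D profiles. The curvature values below are made-up toy "edges" (one with a notch, one smooth), and the sketch omits the learned spatial weights the entry mentions.

```python
def dtw_cost(a, b):
    # classic dynamic-programming time warping:
    # alignment cost of two 1-D curvature profiles, lower = more similar
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            D[i][j] = step + min(D[i - 1][j],      # skip in a
                                 D[i][j - 1],      # skip in b
                                 D[i - 1][j - 1])  # match both
    return D[n][m]

edge_a = [0.1, 0.9, 0.2, 0.2]   # profile with one "notch"
edge_b = [0.1, 0.1, 0.9, 0.2]   # same notch, shifted along the edge
edge_c = [0.1, 0.1, 0.1, 0.1]   # smooth edge, no notch
print(dtw_cost(edge_a, edge_b) < dtw_cost(edge_a, edge_c))  # True
```

Because warping absorbs the shift, the two notched edges align at near-zero cost while the smooth edge stays distant, which is what makes the alignment cost usable as a ranking score.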
We evaluate top-k accuracy on two real-world datasets to demonstrate the effectiveness of the curvature representation, achieving top-1 accuracy scores of approximately 95% and 80% for bottlenose dolphins and humpback whales, respectively.",4 Historical dynamics of the lexical system as a random walk process. It is offered to consider word meaning changes in diachrony as a semicontinuous random walk with reflecting and swallowing screens. The basic characteristics of the word life cycle are defined. Verification of the model is realized on the data of the distribution of Russian words over various age periods.,4 "Learning using privileged information: SVM+ and weighted SVM. Prior knowledge can be used to improve the predictive performance of learning algorithms or reduce the amount of data required for training. The same goal is pursued within the learning using privileged information paradigm, which was recently introduced by Vapnik et al. and is aimed at utilizing additional information available only at training time -- a framework implemented by SVM+. We relate the privileged information to importance weighting and show that the prior knowledge expressible with privileged features can also be encoded by weights associated with every training example. We show that a weighted SVM can always replicate an SVM+ solution, while the converse is not true, and we construct a counterexample highlighting the limitations of SVM+. Finally, we touch on the problem of choosing weights for weighted SVMs when privileged features are not available.",19 "Variable importance in binary regression trees and forests. We characterize and study variable importance (VIMP) and pairwise variable associations in binary regression trees. A key component involves the node mean squared error for a quantity we refer to as a maximal subtree. The theory naturally extends from single trees to ensembles of trees and applies to methods like random forests. This is useful because, while importance values from random forests are used to screen variables, for example to filter high throughput genomic data in bioinformatics, very little theory exists about their properties.",19 "Contextual bandits with latent confounders: an NMF approach. Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. 
The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the latent confounder causally determine the reward, while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for context in $[L]$ and arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms), which achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log T\right)$ at time $T$, compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known statistical RIP condition. We propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback, when the rank is greater than one. We compare the performance of our algorithm with the state of the art on synthetic and real world data-sets.",4 "Improving image generative models with human interactions. GANs provide a framework for training generative models which mimic a data distribution. However, in many cases we wish to train these generative models to optimize some auxiliary objective function within the data they generate, such as making more aesthetically pleasing images. In some cases, these objective functions are difficult to evaluate, e.g. they may require human interaction. Here, we develop a system for efficiently improving a GAN to target an objective involving human interaction, specifically generating images that increase rates of positive user interactions. To improve the generative model, we build a model of human behavior in the targeted domain from a relatively small set of interactions, and then use this behavioral model as an auxiliary loss function to improve the generative model. 
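The $\epsilon$-greedy arm selection underlying the NMF-Bandit entry above can be sketched in its plainest form, without the matrix-factorization step: explore a random arm with probability $\epsilon$, otherwise exploit the best empirical mean. The Bernoulli arm means, horizon and seed below are toy assumptions for illustration only.

```python
import random

def eps_greedy_run(means, rounds, eps, seed=0):
    # stochastic bandit with Bernoulli rewards:
    # explore uniformly with prob. eps, else pull best empirical arm
    rng = random.Random(seed)
    k = len(means)
    pulls, wins = [0] * k, [0] * k
    for _ in range(rounds):
        if rng.random() < eps or min(pulls) == 0:
            arm = rng.randrange(k)                       # explore
        else:
            arm = max(range(k), key=lambda i: wins[i] / pulls[i])
        pulls[arm] += 1
        wins[arm] += 1 if rng.random() < means[arm] else 0
    return pulls

pulls = eps_greedy_run([0.2, 0.8, 0.5], rounds=2000, eps=0.1, seed=1)
print(pulls[1] == max(pulls))
```

With a constant gap between the best arm and the rest, the exploitation pulls concentrate on the best arm while the $\epsilon$ fraction of exploration keeps every empirical mean honest; the entry's contribution is to share those statistics across contexts through the low-rank factors.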
We show that this system is successful at improving positive interaction rates, at least on simulated data, and characterize some of the factors that affect its performance.",4 "Transition-based dependency parsing with pluggable classifiers. In principle, the design of transition-based dependency parsers makes it possible to experiment with any general-purpose classifier without other changes to the parsing algorithm. In practice, however, it often takes substantial software engineering to bridge the different representations used by two software packages. We present extensions to MaltParser that allow the drop-in use of any classifier conforming to the interface of the Weka machine learning package, a wrapper for the TiMBL memory-based learner to this interface, and experiments on multilingual dependency parsing with a variety of classifiers. While earlier work had suggested that memory-based learners might be a good choice for low-resource parsing scenarios, we cannot support that hypothesis in this work. We observed that support-vector machines give better parsing performance than the memory-based learner, regardless of the size of the training set.",4 "Extracting urban impervious surface from GF-1 imagery using one-class classifiers. Impervious surface area is a direct consequence of urbanization, and it also plays an important role in urban planning and environmental management. With the rapid technical development of remote sensing, monitoring urban impervious surface via high spatial resolution (HSR) images has attracted unprecedented attention recently. Traditional multi-class models are inefficient for impervious surface extraction, because they require labeling both the needed and unneeded classes that occur in the image exhaustively. Therefore, we need to find a reliable one-class model that can classify one specific land cover type without labeling other classes. In this study, we investigate several one-class classifiers, such as presence and background learning (PBL), positive unlabeled learning (PUL), OCSVM, BSVM and MAXENT, to extract urban impervious surface area using high spatial resolution imagery of GF-1, China's new generation high spatial resolution remote sensing satellite, and evaluate the classification accuracy based on artificial interpretation results. 
Compared to traditional multi-class classifiers (ANN and SVM), the experimental results indicate that PBL and PUL provide higher classification accuracy, similar to the accuracy provided by the ANN model. Meanwhile, PBL and PUL outperform the OCSVM, BSVM, MAXENT and SVM models. Hence, the one-class classifiers only need a small set of specific samples to train the models without losing predictive accuracy, and deserve more attention for the extraction of urban impervious surface or other specific land cover types.",4 "Optimal learning rates for localized SVMs. One of the limiting factors of using support vector machines (SVMs) in large scale applications is their super-linear computational requirements in terms of the number of training samples. To address this issue, several approaches that train SVMs on many small chunks of large data sets separately have been proposed in the literature. So far, however, almost all these approaches have only been empirically investigated. In addition, their motivation was always based on computational requirements. In this work, we consider a localized SVM approach based upon a partition of the input space. For this local SVM, we derive a general oracle inequality. We then apply this oracle inequality to least squares regression using Gaussian kernels and deduce local learning rates that are essentially minimax optimal under some standard smoothness assumptions on the regression function. This gives the first motivation for using local SVMs that is not based on computational requirements but on theoretical predictions of the generalization performance. We further introduce a data-dependent parameter selection method for our local SVM approach and show that this method achieves the same learning rates as before. Finally, we present some larger scale experiments for our localized SVM showing that it achieves essentially the same test performance as a global SVM for a fraction of the computational requirements. In addition, it turns out that the computational requirements for the local SVMs are similar to those of a vanilla random chunk approach, while the achieved test errors are significantly better.",19 "A hierarchical approach for joint multi-view object pose estimation and categorization. 
We propose a joint object pose estimation and categorization approach which extracts information about object poses and categories from the object parts and compositions constructed at different layers of a hierarchical object representation algorithm, namely Learned Hierarchy of Parts (LHOP). In the proposed approach, we first employ the LHOP to learn hierarchical part libraries which represent entity parts and compositions across different object categories and views. Then, we extract statistical and geometric features from the part realizations of the objects in the images in order to represent the information about object pose and category at each different layer of the hierarchy. Unlike traditional approaches which consider specific layers of the hierarchies in order to extract information to perform specific tasks, we combine the information extracted at different layers to solve a joint object pose estimation and categorization problem using distributed optimization algorithms. We examine the proposed generative-discriminative learning approach and the algorithms on two benchmark 2-D multi-view image datasets. The proposed approach and the algorithms outperform state-of-the-art classification, regression and feature extraction algorithms. In addition, the experimental results shed light on the relationship between object categorization, pose estimation and the part realizations observed at different layers of the hierarchy.",4 "Assessing the threat of adversarial examples on deep neural networks. Deep neural networks are facing a potential security threat from adversarial examples, inputs that look normal but cause an incorrect classification by the deep neural network. For example, the proposed threat could result in hand-written digits on a scanned check being incorrectly classified, while looking normal when humans see them. This research assesses the extent to which adversarial examples pose a security threat when one considers the normal image acquisition process. This process is mimicked by simulating the transformations that normally occur in acquiring an image in a real world application, such as using a scanner to acquire digits for a check amount or using a camera in an autonomous car. These small transformations negate the effect of the carefully crafted perturbations of adversarial examples, resulting in a correct classification by the deep neural network. 
Thus, acquiring the image decreases the potential impact of the proposed security threat. We also show that the already widely used process of averaging over multiple crops neutralizes adversarial examples. Normal preprocessing, such as text binarization, almost completely neutralizes adversarial examples. This is the first paper to show that for text driven classification, adversarial examples are an academic curiosity, not a security threat.",4 "Sample complexity of end-to-end training vs. semantic abstraction training. We compare the end-to-end training approach to a modular approach in which the system is decomposed into semantically meaningful components. We focus on the sample complexity aspect, in the regime where an extremely high accuracy is necessary, as is the case in autonomous driving applications. We demonstrate cases in which the number of training examples required by the end-to-end approach is exponentially larger than the number of examples required by the semantic abstraction approach.",4 "An empirical evaluation of various deep learning architectures for bi-sequence classification tasks. Several tasks in argumentation mining and debating, question-answering, and natural language inference involve classifying a sequence in the context of another sequence (referred to as bi-sequence classification). For several single sequence classification tasks, the current state-of-the-art approaches are based on recurrent and convolutional neural networks. On the other hand, for bi-sequence classification problems, there is not much understanding as to the best deep learning architecture. In this paper, we attempt to get an understanding of this category of problems by extensive empirical evaluation of 19 different deep learning architectures (specifically on different ways of handling context) for various problems originating in natural language processing, like debating, textual entailment and question-answering. Following the empirical evaluation, we offer our insights and conclusions regarding the architectures we have considered. We also establish the first deep learning baselines for three argumentation mining tasks.",4 "Beyond the word-based language model in statistical machine translation. The language model is one of the most important modules in statistical machine translation, and currently the word-based language model dominates in this community. 
However, many translation models (e.g. phrase-based models) generate the target language sentences by rendering and compositing phrases rather than words. Thus, it would be much more reasonable to model the dependency between phrases, but few research works have succeeded in solving this problem. In this paper, we tackle this problem by designing a novel phrase-based language model which attempts to solve three key sub-problems: 1, how to define a phrase in the language model; 2, how to determine phrase boundaries in large-scale monolingual data in order to enlarge the training set; 3, how to alleviate the data sparsity problem due to the huge vocabulary size of phrases. By carefully handling these issues, extensive experiments on Chinese-to-English translation show that our phrase-based language model can significantly improve the translation quality by up to +1.47 absolute BLEU score.",4 "A batchwise monotone algorithm for dictionary learning. We propose a batchwise monotone algorithm for dictionary learning. Unlike state-of-the-art dictionary learning algorithms that impose sparsity constraints on a sample-by-sample basis, we instead treat the samples as a batch and impose the sparsity constraint on the whole. The benefit of batchwise optimization is that the non-zeros can be better allocated across the samples, leading to a better approximation of the whole. To accomplish this, we propose procedures to switch non-zeros in both rows and columns in the support of the coefficient matrix to reduce the reconstruction error. We prove that with the proposed support switching procedure the objective of the algorithm, i.e., the reconstruction error, decreases monotonically and converges. Furthermore, we introduce a block orthogonal matching pursuit algorithm that also operates on sample batches to provide a warm start. Experiments on both natural image patches and UCI data sets show that the proposed algorithm produces a better approximation at the same sparsity levels compared to state-of-the-art algorithms.",4 "A fuzzy soft rough k-means clustering approach for gene expression data. Clustering is one of the widely used data mining techniques for medical diagnosis, and can be considered the most important unsupervised learning technique. Most of the clustering methods group data based on distance, and few methods cluster data based on similarity. 
The clustering algorithms classify gene expression data into clusters in which functionally related genes are grouped together in an efficient manner. The groupings are constructed such that the degree of relationship is strong among members of the same cluster and weak among members of different clusters. In this work, we focus on the similarity relationship among genes with similar expression patterns, so that a consequential and simple analytical decision can be made with the proposed fuzzy soft rough k-means algorithm. The algorithm is developed based on fuzzy soft sets and rough sets. A comparative analysis of the proposed work is made with benchmark algorithms like k-means and rough k-means, and the efficiency of the proposed algorithm is illustrated in this work using various cluster validity measures such as the DB index and the Xie-Beni index.",4 "Large scale distributed semi-supervised learning using streaming approximation. Traditional graph-based semi-supervised learning (SSL) approaches, even though widely applied, are not suited for massive data and large label scenarios, since they scale linearly with the number of edges $|E|$ and distinct labels $m$. To deal with the large label size problem, recent works propose sketch-based methods to approximate the distribution of labels per node, thereby achieving a space reduction from $O(m)$ to $O(\log m)$, under certain conditions. In this paper, we present a novel streaming graph-based SSL approximation that captures the sparsity of the label distribution and ensures that the algorithm propagates labels accurately, and further reduces the space complexity per node to $O(1)$. We also provide a distributed version of the algorithm that scales well to large data sizes. Experiments on real-world datasets demonstrate that the new method achieves better performance than existing state-of-the-art algorithms, with a significant reduction in memory footprint. We also study different graph construction mechanisms for natural language applications and propose a robust graph augmentation strategy, trained using state-of-the-art unsupervised deep learning architectures, that yields significant quality gains.",4 "Active neural localization. Localization is the problem of estimating the location of an autonomous agent from an observation and a map of the environment. 
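The sketch-based approximation of per-node label distributions in the streaming SSL entry above is in the spirit of a count-min sketch: counts for $m$ distinct labels are folded into a fixed-size table, trading exact counts for sublinear space. The sketch below is a generic count-min illustration with made-up labels and sizes, not the paper's specific streaming construction.

```python
import hashlib

class CountMinSketch:
    # approximate label counts in fixed space; estimates may overcount
    # due to hash collisions but never undercount (count-min guarantee)
    def __init__(self, width=64, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, label, row):
        # one deterministic hash function per row
        h = hashlib.sha1(f"{row}:{label}".encode()).hexdigest()
        return int(h, 16) % self.width

    def add(self, label, count=1):
        for r in range(self.depth):
            self.table[r][self._index(label, r)] += count

    def estimate(self, label):
        return min(self.table[r][self._index(label, r)]
                   for r in range(self.depth))

sketch = CountMinSketch()
for label, c in [("sports", 5), ("politics", 2), ("sports", 3)]:
    sketch.add(label, c)
print(sketch.estimate("sports") >= 8)  # True
```

Storage is `width * depth` counters regardless of how many distinct labels stream past a node, which is the kind of space saving that makes label propagation feasible at a large label vocabulary.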
Traditional methods of localization, which filter the belief based on the observations, are sub-optimal in the number of steps required, as they do not decide the actions taken by the agent. We propose the ""Active Neural Localizer"", a fully differentiable neural network that learns to localize accurately and efficiently. The proposed model incorporates ideas of traditional filtering-based localization methods, by using a structured belief of the state with multiplicative interactions to propagate the belief, and combines it with a policy model to localize accurately while minimizing the number of steps required for localization. The Active Neural Localizer is trained end-to-end with reinforcement learning. We use a variety of simulation environments for our experiments, which include random 2D mazes, random mazes in the Doom game engine and a photo-realistic environment in the Unreal game engine. The results on the 2D environments show the effectiveness of the learned policy in an idealistic setting, while the results on the 3D environments demonstrate the model's capability of learning the policy and perceptual model jointly from raw-pixel based RGB observations. We also show that a model trained on random textures in the Doom environment generalizes well to a photo-realistic office space environment in the Unreal engine.",4 "Adversarial discriminative domain adaptation. Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains. They can also improve recognition despite the presence of domain shift or dataset bias: several adversarial approaches to unsupervised domain adaptation have recently been introduced, which reduce the difference between the training and the test domain distributions and thus improve generalization performance. Prior generative approaches show compelling visualizations, but are not optimal on discriminative tasks and can be limited to smaller shifts. Prior discriminative approaches could handle larger domain shifts, but imposed tied weights on the model and did not exploit a GAN-based loss. We first outline a novel generalized framework for adversarial adaptation, which subsumes recent state-of-the-art approaches as special cases, and we use this generalized view to better relate the prior approaches. 
We then propose a previously unexplored instance of our general framework which combines discriminative modeling, untied weight sharing, and a GAN loss, which we call Adversarial Discriminative Domain Adaptation (ADDA). We show that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and demonstrate the promise of our approach by exceeding state-of-the-art unsupervised adaptation results on standard cross-domain digit classification tasks and a new, more difficult cross-modality object classification task.",4 "A tensor sparse and low-rank based submodule clustering method for multi-way data. A new submodule clustering method via sparse and low-rank representation for multi-way data is proposed in this paper. Instead of reshaping multi-way data into vectors, this method maintains their natural orders to preserve the data intrinsic structures, e.g., image data are kept as matrices. To implement clustering, the multi-way data, viewed as tensors, are represented by the proposed tensor sparse and low-rank model to obtain their submodule representation, called a free module, which is finally used for spectral clustering. The proposed method extends the conventional subspace clustering method based on sparse and low-rank representation to multi-way data submodule clustering by combining the t-product operator. The new method is tested on several public datasets, including synthetical data, video sequences and toy images. The experiments show that the new method outperforms state-of-the-art methods, such as sparse subspace clustering (SSC), low-rank representation (LRR), ordered subspace clustering (OSC), robust latent low rank representation (RobustLatLRR) and the sparse submodule clustering method (SSmC).",4 "Lifelong learning CRF for supervised aspect extraction. This paper makes a focused contribution to supervised aspect extraction. It shows that if the system has performed aspect extraction in many past domains and retained their results as knowledge, conditional random fields (CRF) can leverage this knowledge in a lifelong learning manner to extract in a new domain markedly better than the traditional CRF without using this prior knowledge. 
the key innovation is that even after the crf training, the model can still improve its extraction with experiences in its applications.",4 "topic modeling of public repositories at scale using names in source code. programming languages have a limited number of reserved keywords and character based tokens that define the language specification. however, programmers make rich use of natural language within their code through comments, text literals and the naming of entities. the programmer defined names that can be found in source code are a rich source of information for building a high level understanding of the project. the goal of this paper is to apply topic modeling to the names used in over 13.6 million repositories and perceive the inferred topics. one of the problems in such a study is the occurrence of duplicate repositories not officially marked as forks (obscure forks). we show how to address this using the same identifiers which are extracted for topic modeling. we open a discussion on naming in source code, elaborate our approach to remove exact duplicate and fuzzy duplicate repositories using locality sensitive hashing on the bag-of-words model, discuss our work on topic modeling, and finally present the results of our data analysis together with the open-access source code, tools and datasets.",4 "im2flow: motion hallucination from static images for action recognition. existing methods to recognize actions in static images take the images at face value, learning the appearances---objects, scenes, and body poses---that distinguish each action class. however, such models are deprived of the rich dynamic structure and motions that also define human activity. we propose an approach that hallucinates the unobserved future motion implied by a single snapshot to help static-image action recognition. the key idea is to learn a prior over short-term dynamics from thousands of unlabeled videos, infer the anticipated optical flow on novel static images, and then train discriminative models that exploit both streams of information. our main contributions are twofold. first, we devise an encoder-decoder convolutional neural network and a novel optical flow encoding that can translate a static image into an accurate flow map. second, we show the power of hallucinated flow for recognition, successfully transferring the learned motion into a standard two-stream network for activity recognition. on seven datasets, we demonstrate the power of the approach.
it not only achieves state-of-the-art accuracy for dense optical flow prediction, but also consistently enhances recognition of actions and dynamic scenes.",4 "automated playtesting with procedural personas through mcts with evolved heuristics. this paper describes a method for generative player modeling and its application to the automatic testing of game content using archetypal player models called procedural personas. theoretically grounded in psychological decision theory, procedural personas are implemented using a variation of monte carlo tree search (mcts) where the node selection criteria are developed using evolutionary computation, replacing the standard ucb1 criterion of mcts. using these personas we demonstrate how generative player models can be applied to a varied corpus of game levels and demonstrate how different play styles can be enacted in each level. in short, we use artificially intelligent personas to construct synthetic playtesters. the proposed approach could be used as a tool for automatic play testing when human feedback is not readily available or when quick visualization of potential interactions is necessary. possible applications include interactive tools during game development and procedural content generation systems where many evaluations must be conducted within a short time span.",4 "plausibility and probability in deductive reasoning. we consider the problem of rational uncertainty about unproven mathematical statements, as g\""odel and others have remarked on. using bayesian-inspired arguments we build a normative model of fair bets under deductive uncertainty which draws from both probability theory and the theory of algorithms. we comment on connections to zeilberger's notion of ""semi-rigorous proofs"", particularly its inherent subjectivity as an obstacle.",4 "robust and fast decoding of high-capacity color qr codes for mobile applications. the use of color in qr codes brings extra data capacity, but also inflicts tremendous challenges on the decoding process due to chromatic distortion, cross-channel color interference and illumination variation. particularly, we discover a new type of chromatic distortion in high-density color qr codes, cross-module color interference, caused by the high density, which also makes geometric distortion correction more challenging.
to address these problems, we propose two approaches, namely, lsvm-cmi and qda-cmi, which jointly model the different types of chromatic distortion. extended from svm and qda, respectively, lsvm-cmi and qda-cmi optimize a particular objective function to learn a color classifier. furthermore, a robust geometric transformation method and several pipeline refinements are proposed to boost the decoding performance for mobile applications. we put forth and implement a framework for high-capacity color qr codes equipped with our methods, called hiq. to evaluate the performance of hiq, we collect a challenging large-scale color qr code dataset, cuhk-cqrc, which consists of 5390 high-density color qr code samples. the comparison with the baseline method [2] on cuhk-cqrc shows that hiq outperforms [2] by at least 188% in decoding success rate and 60% in bit error rate. our implementation of hiq in ios and android also demonstrates the effectiveness of our framework in real-world applications.",4 "a hebbian/anti-hebbian neural network for linear subspace learning: a derivation from multidimensional scaling of streaming data. neural network models of early sensory processing typically reduce the dimensionality of streaming input data. such networks learn the principal subspace, in the sense of principal component analysis (pca), by adjusting synaptic weights according to activity-dependent learning rules. when derived from a principled cost function, these rules are nonlocal and hence biologically implausible. at the same time, biologically plausible local rules have been postulated rather than derived from a principled cost function. here, to bridge this gap, we derive a biologically plausible network for subspace learning on streaming data by minimizing a principled cost function. in a departure from previous work, where the cost was quantified by the representation, or reconstruction, error, we adopt a multidimensional scaling (mds) cost function for streaming data. the resulting algorithm relies only on biologically plausible hebbian and anti-hebbian local learning rules. in a stochastic setting, the synaptic weights converge to a stationary state which projects the input data onto the principal subspace. if the data are generated by a nonstationary distribution, the network can track the principal subspace.
thus, our result makes a step towards an algorithmic theory of neural computation.",16 "cuckoo search: recent advances and applications. cuckoo search (cs) is a relatively new algorithm, developed by yang and deb in 2009, and cs is efficient in solving global optimization problems. in this paper, we review the fundamental ideas of cuckoo search and the latest developments as well as its applications. we analyze the algorithm to gain insight into its search mechanisms and find out why it is efficient. we also discuss the essence of such algorithms and their link to self-organizing systems, and finally propose important topics for further research.",12 "word sense disambiguation via high order learning in complex networks. complex networks have been employed to model many real systems and as a modeling tool in a myriad of applications. in this paper, we use the framework of complex networks for the problem of supervised classification in the word disambiguation task, which consists in deriving a function from the supervised (or labeled) training data of ambiguous words. traditional supervised data classification takes into account only topological or physical features of the input data. on the other hand, the human (animal) brain performs both low- and high-level orders of learning and has the facility to identify patterns according to the semantic meaning of the input data. in this paper, we apply a hybrid technique that encompasses both types of learning to the field of word sense disambiguation and show that the high-level order of learning can really improve the accuracy rate of the model. this evidence serves to demonstrate that the internal structures formed by the words do present patterns that, generally, cannot be correctly unveiled by traditional techniques alone. finally, we exhibit the behavior of the model for different weights of the low- and high-level classifiers by plotting decision boundaries. this study helps one to better understand the effectiveness of the model.",15 "training and evaluating multimodal word embeddings with large-scale web annotated images. in this paper, we focus on training and evaluating effective word embeddings with both text and visual information. more specifically, we introduce a large-scale dataset with 300 million sentences describing over 40 million images crawled and downloaded from publicly available pins (i.e. an image with sentence descriptions uploaded by users) on pinterest. this dataset is more than 200 times larger than ms coco, the standard large-scale image dataset with sentence descriptions.
in addition, we construct an evaluation dataset to directly assess the effectiveness of word embeddings in terms of finding semantically similar or related words and phrases. the word/phrase pairs in this evaluation dataset are collected from the click data of millions of users in an image search system, and thus contain rich semantic relationships. based on these datasets, we propose and compare several recurrent neural network (rnn) based multimodal (text and image) models. experiments show that our model benefits from incorporating the visual information into the word embeddings, and that a weight sharing strategy is crucial for learning such multimodal embeddings. the project page is: http://www.stat.ucla.edu/~junhua.mao/multimodal_embedding.html",4 "learning rgb-d salient object detection using background enclosure, depth contrast, and top-down features. recently, deep convolutional neural networks (cnn) have demonstrated strong performance on rgb salient object detection. although depth information can help improve detection results, the exploration of cnns for rgb-d salient object detection remains limited. we propose a novel deep cnn architecture for rgb-d salient object detection that exploits high-level, mid-level, and low level features. further, we present novel depth features that capture the ideas of background enclosure and depth contrast and are suitable for a learned approach. we show improved results compared to state-of-the-art rgb-d salient object detection methods. we also show that the low-level and mid-level depth features both contribute to improvements in the results. especially, the f-score of our method is 0.848 on the rgbd1000 dataset, which is 10.7% better than the second place.",4 "learning non-lambertian object intrinsics across shapenet categories. we consider the non-lambertian object intrinsic problem of recovering diffuse albedo, shading, and specular highlights from a single image of an object. we build a large-scale object intrinsics database based on existing 3d models in the shapenet database. rendered with realistic environment maps, millions of synthetic images of objects and their corresponding albedo, shading, and specular ground-truth images are used to train an encoder-decoder cnn. once trained, the network can decompose an image into the product of albedo and shading components, along with an additive specular component.
our cnn delivers accurate and sharp results in this classical inverse problem of computer vision, with the sharp details attributed to the skip layer connections at corresponding resolutions from encoder to decoder. benchmarked on shapenet and mit intrinsics datasets, our model consistently outperforms the state-of-the-art by a large margin. we train and test our cnn on different object categories. perhaps surprisingly, especially from the cnn classification perspective, our intrinsics cnn generalizes well across categories. our analysis shows that feature learning at the encoder stage is crucial for developing a universal representation across categories. we apply our synthetic data trained model to images and videos downloaded from the internet, and observe robust and realistic intrinsics results. quality non-lambertian intrinsics could open up many interesting applications such as image-based albedo and specular editing.",4 "hitting times of local and global optima in genetic algorithms with high selection pressure. the paper is devoted to upper bounds on the expected first hitting times of the sets of local or global optima for non-elitist genetic algorithms with high selection pressure. the results of this paper extend the range of situations where upper bounds on the expected runtime are known for genetic algorithms and apply, in particular, to the canonical genetic algorithm. the obtained bounds do not require the probability of fitness-decreasing mutation to be bounded by a constant less than one.",4 "stable segmentation of digital image. in this paper the optimal image segmentation by means of piecewise constant approximations is considered. the optimality is defined by a minimum value of the total squared error or by the equivalent value of the standard deviation of the approximation from the image. the optimal approximations are defined independently of the method of obtaining them and might be generated by different algorithms. we investigate the computation of the optimal approximation on the grounds of stability with respect to a given set of modifications. to obtain the optimal approximation the mumford-shah model is generalized and developed, and its computational part is combined with the otsu method in its multi-thresholding version. the proposed solution is proved analytically and experimentally on the example of a standard image.",4 "visual-inertial-semantic scene representation for 3-d object detection.
we describe a system to detect objects in three-dimensional space using video and inertial sensors (accelerometer and gyrometer), ubiquitous in modern mobile platforms from phones to drones. inertials afford the ability to impose class-specific scale priors for objects, and provide a global orientation reference. a minimal sufficient representation, the posterior of semantic (identity) and syntactic (pose) attributes of objects in space, can be decomposed into a geometric term, which can be maintained by a localization-and-mapping filter, and a likelihood function, which can be approximated by a discriminatively-trained convolutional neural network. the resulting system can process the video stream causally in real time, and provides a representation of objects in the scene that is persistent: confidence in the presence of objects grows with evidence, and objects previously seen are kept in memory even when temporarily occluded, with their return into view automatically predicted to prime re-detection.",4 "evolved policy gradients. we propose a meta-learning approach for learning gradient-based reinforcement learning (rl) algorithms. the idea is to evolve a differentiable loss function, such that an agent which optimizes its policy to minimize this loss will achieve high rewards. the loss is parametrized via temporal convolutions over the agent's experience. because this loss is highly flexible in its ability to take into account the agent's history, it enables fast task learning and eliminates the need for reward shaping at test time. empirical results show that our evolved policy gradient algorithm achieves faster learning on several randomized environments compared to an off-the-shelf policy gradient method. moreover, at test time, our learner optimizes only its learned loss function, and requires no explicit reward signal. in effect, the agent internalizes the reward structure, suggesting a direction toward agents that learn to solve new tasks simply from intrinsic motivation.",4 "a novel variational model for image registration using gaussian curvature. image registration is one important task in many image processing applications. it aims to align two or more images so that useful information can be extracted through comparison, combination or superposition. this is achieved by constructing an optimal transformation which ensures that the template image becomes similar to a given reference image.
although many models exist, designing a model capable of modelling large and smooth deformation fields continues to pose a challenge. this paper proposes a novel variational model for image registration using the gaussian curvature as a regulariser. the model is motivated by the surface restoration work in geometric processing [elsey and esedoglu, multiscale model. simul., (2009), pp. 1549-1573]. an effective numerical solver is provided for the model using the augmented lagrangian method. numerical experiments show that the new model outperforms three competing models based on, respectively, the linear curvature [fischer and modersitzki, j. math. imaging vis., (2003), pp. 81-85], the mean curvature [chumchob, chen and brito, multiscale model. simul., (2011), pp. 89-128] and the diffeomorphic demon model [vercauteren et al., neuroimage, (2009), pp. 61-72] in terms of robustness and accuracy.",12 "regret analysis for continuous dueling bandit. the dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. in this research, we address a dueling bandit problem based on a cost function over a continuous space. we propose a stochastic mirror descent algorithm and show that the algorithm achieves an $o(\sqrt{t\log t})$-regret bound under strong convexity and smoothness assumptions for the cost function. subsequently, we clarify the equivalence between regret minimization in the dueling bandit and convex optimization of the cost function. moreover, considering the lower bound in convex optimization, our algorithm is shown to achieve the optimal convergence rate in convex optimization and the optimal regret in the dueling bandit except for a logarithmic factor.",19 "genegan: learning object transfiguration and attribute subspace from unpaired data. object transfiguration replaces an object in an image with another object from a second image. for example it can perform tasks like ""putting exactly those eyeglasses from image a on the nose of the person in image b"". usage of exemplar images allows more precise specification of desired modifications and improves the diversity of conditional image generation. however, previous methods rely on feature space operations and require paired data and/or appearance models for training or for disentangling objects from background.
in this work, we propose a model that can learn object transfiguration from two unpaired sets of images: one set containing images that ""have"" that kind of object, and the other set being the opposite, with the mild constraint that the objects are located approximately at the same place. for example, the training data can be one set of reference face images that have eyeglasses, and another set of images that do not, both spatially aligned by face landmarks. despite the weak 0/1 labels, our model can learn an ""eyeglasses"" subspace that contains multiple representatives of different types of glasses. consequently, we can perform fine-grained control of generated images, like swapping the glasses between two images by swapping the projected components in the ""eyeglasses"" subspace, to create novel images of people wearing eyeglasses. overall, our deterministic generative model learns disentangled attribute subspaces from weakly labeled data by adversarial training. experiments on celeba and multi-pie datasets validate the effectiveness of the proposed model on real world data, in generating images with specified eyeglasses, smiling, hair styles, and lighting conditions etc. the code is available online.",4 "interactive data integration through smart copy & paste. in many scenarios, such as emergency response or ad hoc collaboration, it is critical to reduce the overhead of integrating data. ideally, one could perform the entire process interactively under one unified interface: defining extractors and wrappers for sources, creating a mediated schema, and adding schema mappings -- all while seeing the impact on the integrated view of the data, and refining the design accordingly. we propose a novel smart copy and paste (scp) model and architecture for seamlessly combining the design-time and run-time aspects of data integration, and we describe an initial prototype, the copycat system. in copycat, the user does not need special tools for the different stages of integration: instead, the system watches as the user copies data from applications (including the web browser) and pastes it into copycat's spreadsheet-like workspace. copycat generalizes these actions and presents proposed auto-completions, each with an explanation in the form of provenance. the user provides feedback on these suggestions -- either through direct interactions or through further copy-and-paste operations -- and the system learns from this feedback.
this paper provides an overview of the prototype system and identifies key research challenges in achieving scp in its full generality.",4 "an entity-aware language model as an unsupervised reranker. in language modeling, it is difficult to incorporate entity relationships from a knowledge-base. one solution is to use a reranker trained with global features, where the global features are derived from n-best lists. however, training such a reranker requires manually annotated n-best lists, which are expensive to obtain. we propose a method based on the contrastive estimation method~\cite{smith2005contrastive} that alleviates the need for such data. experiments in the music domain demonstrate that global features, as well as features extracted from an external knowledge-base, can be incorporated into our reranker. our final model achieves a 0.44 absolute word error rate improvement on the blind test data.",4 "kernel alignment inspired linear discriminant analysis. kernel alignment measures the degree of similarity between two kernels. in this paper, inspired by kernel alignment, we propose a new linear discriminant analysis (lda) formulation, kernel alignment lda (kalda). we first define two kernels, the data kernel and the class indicator kernel. the problem is then to find a subspace that maximizes the alignment between the subspace-transformed data kernel and the class indicator kernel. surprisingly, the kernel alignment induced kalda objective function is very similar to that of classical lda and can be expressed using between-class and total scatter matrices. it can be extended to multi-label data. we use a stiefel-manifold gradient descent algorithm to solve this problem. we perform experiments on 8 single-label and 6 multi-label data sets. results show that kalda has good performance on many single-label and multi-label problems.",4 "new characterizations of minimum spanning trees and of saliency maps based on quasi-flat zones. we study three representations of hierarchies of partitions: dendrograms (direct representations), saliency maps, and minimum spanning trees. we provide a new bijection between saliency maps and hierarchies based on the quasi-flat zones used in image processing, and we characterize saliency maps and minimum spanning trees as solutions to constrained minimization problems where the constraint is quasi-flat zones preservation.
in practice, these results form a toolkit for new hierarchical methods where one can choose the most convenient representation. they also invite us to process non-image data with morphological hierarchies.",4 "mining heterogeneous multivariate time-series for learning meaningful patterns: application to home health telecare. over the last years, time-series mining has become a challenging issue for researchers. an important application lies in most monitoring purposes, which require analyzing large sets of time-series for learning usual patterns. any deviation from this learned profile is then considered as an unexpected situation. moreover, complex applications may involve the temporal study of several heterogeneous parameters. in this paper, we propose a method for mining heterogeneous multivariate time-series for learning meaningful patterns. the proposed approach allows for mixed time-series -- containing both pattern and non-pattern data -- as well as imprecise matches, outliers, and stretching and global translating of pattern instances in time. we present the early results of our approach in the context of monitoring the health status of a person at home. the purpose is to build a behavioral profile of a person by analyzing the time variations of several quantitative or qualitative parameters recorded through the provision of sensors installed in the home.",4 "hats: histograms of averaged time surfaces for robust event-based object classification. event-based cameras have recently drawn the attention of the computer vision community thanks to their advantages in terms of high temporal resolution, low power consumption and high dynamic range, compared to traditional frame-based cameras. these properties make event-based cameras an ideal choice for autonomous vehicles, robot navigation or uav vision, among others. however, the accuracy of event-based object classification algorithms, which is of crucial importance for any reliable system working in real-world conditions, is still far behind their frame-based counterparts. the two main reasons for this performance gap are: 1. the lack of effective low-level representations and architectures for event-based object classification and 2. the absence of large real-world event-based datasets. in this paper we address both problems. first, we introduce a novel event-based feature representation together with a new machine learning architecture.
compared to previous approaches, we use local memory units to efficiently leverage past temporal information and build a robust event-based representation. second, we release the first large real-world event-based dataset for object classification. we compare our method with the state-of-the-art in extensive experiments, showing better classification performance and real-time computation.",4 "the tractability of theory patching. in this paper we consider the problem of `theory patching', in which we are given a domain theory, some of whose components are indicated as possibly flawed, together with a set of labeled training examples of the domain concept. the theory patching problem is to revise the indicated components of the theory such that the resulting theory correctly classifies all the training examples. theory patching is thus a type of theory revision in which revisions are made to individual components of the theory. our concern in this paper is to determine for which classes of logical domain theories the theory patching problem is tractable. we consider propositional and first-order domain theories, and show that the theory patching problem is equivalent to determining what information contained in a theory is `stable' regardless of what revisions might be performed to the theory. we show that determining stability is tractable if the input theory satisfies two conditions: that revisions to each theory component have monotonic effects on the classification of examples, and that theory components act independently in the classification of examples by the theory. we also show how the concepts introduced can be used to determine the soundness and completeness of particular theory patching algorithms.",4 "early fire detection using hep and space-time analysis. in this article, a video based early fire alarm system is developed by monitoring the smoke in the scene. there are two major contributions in this work. first, to find the best texture feature for smoke detection, a general framework, named histograms of equivalent patterns (hep), is adopted to achieve an extensive evaluation of various kinds of texture features. second, a \emph{block based inter-frame difference} (bifd) and an improved version of lbp-top are proposed and ensembled to describe the space-time characteristics of the smoke. in order to reduce false alarms, a smoke history image (shi) is utilized to register the recent classification results of candidate smoke blocks.
experimental results using an svm show that the proposed method can achieve better accuracy and fewer false alarms compared with state-of-the-art technologies.",4 "vain: attentional multi-agent predictive modeling. multi-agent predictive modeling is an essential step for understanding physical, social and team-play systems. recently, interaction networks (ins) were proposed for the task of modeling multi-agent physical systems, but ins scale with the number of interactions in the system (typically quadratic or higher order in the number of agents). in this paper we introduce vain, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents. we show that vain is effective for multi-agent predictive modeling. our method is evaluated on tasks from challenging multi-agent prediction domains: chess and soccer, and outperforms competing multi-agent approaches.",4 "pid parameters optimization using genetic algorithm. time delays are components that make time-lag in a system's response. they arise in physical, chemical, biological and economic systems, as well as in the process of measurement and computation. in this work, we implement a genetic algorithm (ga) for determining pid controller parameters to compensate the delay in a first order lag plus time delay (folpd) system and compare the results with the iterative method and ziegler-nichols rule results.",4 "attend, infer, repeat: fast scene understanding with generative models. we present a framework for efficient inference in structured image models that explicitly reason about objects. we achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time. crucially, the model itself learns to choose the appropriate number of inference steps. we use this scheme to learn to perform inference in partially specified 2d models (variable-sized variational auto-encoders) and fully specified 3d models (probabilistic renderers). we show that such models learn to identify multiple objects - counting, locating and classifying the elements of a scene - without any supervision, e.g., decomposing 3d images with various numbers of objects in a single forward pass of a neural network.
we show that the networks produce accurate inferences when compared to supervised counterparts, and that their structure leads to improved generalization.",4 "towards improving validation, verification, crash investigations, and event reconstruction of flight-critical systems with self-forensics. this paper introduces a novel concept of self-forensics to complement the standard autonomic self-chop properties of self-managed systems, specified in the forensic lucid language. we argue that self-forensics, with the forensics taken out of the cybercrime domain, is applicable to ""self-dissection"" for the purpose of verification of autonomous software and hardware systems of flight-critical systems, for automated incident and anomaly analysis and event reconstruction by the engineering teams in a variety of incident scenarios during design and testing as well as on actual flight data.",4 "thinking, learning, and autonomous problem solving. ever increasing computational power will require methods for automatic programming. we present an alternative to genetic programming, based on a general model of thinking and learning. the advantage is that evolution takes place in the space of constructs and can thus exploit the mathematical structures of this space. the model is formalized, and a macro language is presented that allows a formal yet intuitive description of the problem under consideration. a prototype was developed to implement the scheme in perl. the method may lead to a concentration on the analysis of problems, rapid prototyping, the treatment of new problem classes, and the investigation of philosophical problems. we see fields of application in nonlinear differential equations, pattern recognition, robotics, model building, and animated pictures.",4 "joint inference of multiple label types in large networks. we tackle the problem of inferring node labels in a partially labeled graph where each node in the graph has multiple label types and each label type has a large number of possible labels. our primary example, and the focus of this paper, is the joint inference of label types such as hometown, current city, and employers, for users connected by a social network. standard label propagation fails to consider the properties of the label types and the interactions between them. our proposed method, called edgeexplain, explicitly models these, while still enabling scalable inference under a distributed message-passing architecture.
on a billion-node subset of the facebook social network, edgeexplain significantly outperforms label propagation for several label types, with lifts of up to 120% for recall@1 and 60% for recall@3.",4 "on consistent vertex nomination schemes. given a vertex of interest in a network $g_1$, the vertex nomination problem seeks to find the corresponding vertex of interest (if it exists) in a second network $g_2$. although the vertex nomination problem and related tasks have attracted much attention in the machine learning literature, with applications to social and biological networks, the framework has so far been confined to a comparatively small class of network models, and the concept of statistically consistent vertex nomination schemes has been only shallowly explored. in this paper, we extend the vertex nomination problem to a very general statistical model of graphs. further, drawing inspiration from the long-established classification framework in the pattern recognition literature, we provide definitions for the key notions of bayes optimality and consistency in our extended vertex nomination framework, including a derivation of the bayes optimal vertex nomination scheme. in addition, we prove that universally consistent vertex nomination schemes exist. illustrative examples are provided throughout.",19 "text2action: generative adversarial synthesis from language to action. in this paper, we propose a generative model which learns the relationship between language and human action in order to generate a human action sequence given a sentence describing human behavior. the proposed generative model is a generative adversarial network (gan), which is based on the sequence to sequence (seq2seq) model. using the proposed generative network, we can synthesize various actions for a robot or a virtual agent using a text encoder recurrent neural network (rnn) and an action decoder rnn. the proposed generative network is trained from 29,770 pairs of actions and sentence annotations extracted from msr-video-to-text (msr-vtt), a large-scale video dataset. we demonstrate that the network can generate human-like actions which can be transferred to a baxter robot, such that the robot performs an action based on a provided sentence.
the results show that the proposed generative network correctly models the relationship between language and action, and can generate a diverse set of actions from a sentence.",4 "leveraging the path signature for skeleton-based human action recognition. human action recognition in videos is one of the most challenging tasks in computer vision. one important issue is how to design discriminative features for representing spatial context and temporal dynamics. here, we introduce a path signature feature to encode information from intra-frame and inter-frame contexts. a key step towards leveraging this feature is to construct the proper trajectories (paths) from the data stream. in each frame, the correlated constraints of human joints are treated as small paths, and the spatial path signature features are extracted from them. in video data, the evolution of these spatial features over time is also regarded as paths, from which the temporal path signature features are extracted. eventually, all the features are concatenated to constitute the input vector of a fully connected neural network for action classification. experimental results on four standard benchmark action datasets, j-hmdb, sbu dataset, berkeley mhad, and nturgb+d, demonstrate that the proposed approach achieves state-of-the-art accuracy even in comparison with recent deep learning based models.",4 "time series analysis via matrix estimation. we consider the task of interpolating and forecasting a time series in the presence of noise and missing data. as the main contribution of this work, we introduce an algorithm that transforms the observed time series into a matrix, utilizes singular value thresholding to simultaneously recover missing values and de-noise observed entries, and performs linear regression to make predictions. we argue that this method provides meaningful imputation and forecasting for a large class of models: finite sums of harmonics (which approximate stationary processes), non-stationary sublinear trends, linear time-invariant (lti) systems, and their additive mixtures. in general, our algorithm recovers the hidden state of dynamics based on its noisy observations, like that of a hidden markov model (hmm), provided the dynamics obey the above stated models.
we demonstrate on synthetic and real-world datasets that our algorithm outperforms standard software packages in the presence of significantly missing data and high levels of noise, even when those packages are given the underlying model while our algorithm remains oblivious to it. this is in line with our finite sample analysis of the model classes.",4 "manifold matching using shortest-path distance and joint neighborhood selection. matching datasets of multiple modalities has become an important task in data analysis. existing methods often rely on the embedding and transformation of each single modality without utilizing the correspondence information, which often results in sub-optimal matching performance. in this paper, we propose a nonlinear manifold matching algorithm using shortest-path distance and joint neighborhood selection. specifically, a joint nearest-neighbor graph is built for all modalities. then the shortest-path distance within each modality is calculated from the joint neighborhood graph, followed by embedding into and matching in a common low-dimensional euclidean space. compared to existing algorithms, our approach exhibits superior performance for matching disparate datasets of multiple modalities.",19 "representation learning for visual-relational knowledge graphs. a visual-relational knowledge graph (kg) is a multi-relational graph whose entities are associated with images. we introduce imagegraph, a kg with 1,330 relation types, 14,870 entities, and 829,931 images. visual-relational kgs lead to novel probabilistic query types where images are treated as first-class citizens. both the prediction of relations between unseen images and multi-relational image retrieval can be formulated as query types in a visual-relational kg. we approach the problem of answering such queries with a novel combination of deep convolutional networks and models for learning knowledge graph embeddings. the resulting models can answer queries such as ""how are these two unseen images related to each other?"" we also explore a zero-shot learning scenario where an image of an entirely new entity is linked with multiple relations to entities of an existing kg. the multi-relational grounding of unseen entity images into a knowledge graph serves as a description of such an entity.
conduct experiments demonstrate proposed deep architectures combination kg embedding objectives answer visual-relational queries efficiently accurately.",4 "ssh: single stage headless face detector. introduce single stage headless (ssh) face detector. unlike two stage proposal-classification detectors, ssh detects faces single stage directly early convolutional layers classification network. ssh headless. is, able achieve state-of-the-art results removing ""head"" underlying classification network -- i.e. fully connected layers vgg-16 contains large number parameters. additionally, instead relying image pyramid detect faces various scales, ssh scale-invariant design. simultaneously detect faces different scales single forward pass network, different layers. properties make ssh fast light-weight. surprisingly, headless vgg-16, ssh beats resnet-101-based state-of-the-art wider dataset. even though, unlike current state-of-the-art, ssh use image pyramid 5x faster. moreover, image pyramid deployed, light-weight network achieves state-of-the-art subsets wider dataset, improving ap 2.5%. ssh also reaches state-of-the-art results fddb pascal-faces datasets using small input size, leading runtime 50 ms/image gpu. code available https://github.com/mahyarnajibi/ssh.",4 "decentralized supply chain formation: market protocol competitive equilibrium analysis. supply chain formation process determining structure terms exchange relationships enable multilevel, multiagent production activity. present simple model supply chains, highlighting two characteristic features: hierarchical subtask decomposition, resource contention. decentralize formation process, introduce market price system resources produced along chain. competitive equilibrium system, agents choose locally optimal allocations respect prices, outcomes optimal overall. determine prices, define market protocol based distributed, progressive auctions, myopic, non-strategic agent bidding policies. 
presence resource contention, protocol produces better solutions greedy protocols common artificial intelligence multiagent systems literature. protocol often converges high-value supply chains, competitive equilibria exist, typically approximate competitive equilibria. however, complementarities agent production technologies cause protocol wastefully allocate inputs agents produce outputs. subsequent decommitment phase recovers significant fraction lost surplus.",4 "multi-objective contextual bandit problem similarity information. paper propose multi-objective contextual bandit problem similarity information. problem extends classical contextual bandit problem similarity information introducing multiple possibly conflicting objectives. since best arm objective different given context, learning best arm based single objective jeopardize rewards obtained objectives. order evaluate performance learner setup, use performance metric called contextual pareto regret. essentially, contextual pareto regret sum distances arms chosen learner context dependent pareto front. problem, develop new online learning algorithm called pareto contextual zooming (pcz), exploits idea contextual zooming learn arms close pareto front observed context adaptively partitioning joint context-arm set according observed rewards locations context-arm pairs selected past. then, prove pcz achieves $\tilde{o}(t^{(1+d_p)/(2+d_p)})$ pareto regret $d_p$ pareto zooming dimension depends size set near-optimal context-arm pairs. moreover, show regret bound nearly optimal providing almost matching $\omega(t^{(1+d_p)/(2+d_p)})$ lower bound.",19 "algorithms closed rational behavior (curb) sets. provide series algorithms demonstrating solutions according fundamental game-theoretic solution concept closed rational behavior (curb) sets two-player, normal-form games computed polynomial time (we also discuss extensions n-player games).
first, describe algorithm identifies player's best responses conditioned belief player play within given subset strategy space. algorithm serves subroutine series polynomial-time algorithms finding minimal curb sets, one minimal curb set, smallest minimal curb set game. show complexity finding nash equilibrium exponential size game's smallest curb set. related this, show smallest curb set arbitrarily small portion game, also arbitrarily larger supports enclosed nash equilibrium. test algorithms empirically find commonly studied academic games tend either large small minimal curb sets.",4 "convolutional kernel networks. important goal visual recognition devise image representations invariant particular transformations. paper, address goal new type convolutional neural network (cnn) whose invariance encoded reproducing kernel. unlike traditional approaches neural networks learned either represent data solving classification task, network learns approximate kernel feature map training data. approach enjoys several benefits classical ones. first, teaching cnns invariant, obtain simple network architectures achieve similar accuracy complex ones, easy train robust overfitting. second, bridge gap neural network literature kernels, natural tools model invariance. evaluate methodology visual recognition tasks cnns proven perform well, e.g., digit recognition mnist dataset, challenging cifar-10 stl-10 datasets, accuracy competitive state art.",4 text analysis tools spoken language processing. submission contains postscript final version slides used acl-94 tutorial.,2 "deep reinforcement learning raw pixels doom. using current reinforcement learning methods, recently become possible learn play unknown 3d games raw pixels. work, study challenges arise complex environments, summarize current methods approach these. choose task within doom game, approached yet. goal agent fight enemies 3d world consisting five rooms. train dqn lstm-a3c algorithms task. 
results show algorithms learn sensible policies, fail achieve high scores given amount training. provide insights learned behavior, serve valuable starting point research doom domain.",4 "spatial random sampling: structure-preserving data sketching tool. random column sampling guaranteed yield data sketches preserve underlying structures data may sample sufficiently less-populated data clusters. also, adaptive sampling often provide accurate low rank approximations, yet may fall short producing descriptive data sketches, especially cluster centers linearly dependent. motivated that, paper introduces novel randomized column sampling tool dubbed spatial random sampling (srs), data points sampled based proximity randomly sampled points unit sphere. compelling feature srs corresponding probability sampling given data cluster proportional surface area cluster occupies unit sphere, independently size cluster population. although fully randomized, srs shown provide descriptive balanced data representations. proposed idea addresses pressing need data science holds potential inspire many novel approaches analysis big data.",4 "extracting temporal causal relations events. structured information resulting temporal information processing crucial variety natural language processing tasks, instance generate timeline summarization events news documents, answer temporal/causal-related questions events. thesis present framework integrated temporal causal relation extraction system. first develop robust extraction component type relations, i.e. temporal order causality. combine two extraction components integrated relation extraction system, catena---causal temporal relation extraction natural language texts---, utilizing presumption event precedence causality, causing events must happened resulting events. several resources techniques improve relation extraction systems also discussed, including word embeddings training data expansion. 
finally, report adaptation efforts temporal information processing languages english, namely italian indonesian.",4 "verification generalized inconsistency-aware knowledge action bases (extended version). knowledge action bases (kabs) put forward semantically rich representation domain, using dl kb account static aspects, actions evolve extensional part time, possibly introducing new objects. recently, kabs extended manage inconsistency, ad-hoc verification techniques geared towards specific semantics. work provides twofold contribution along line research. one hand, enrich kabs high-level, compact action language inspired golog, obtaining called golog-kabs (gkabs). hand, introduce parametric execution semantics gkabs, elegantly accommodate plethora inconsistency-aware semantics based notion repair. provide several reductions verification sophisticated first-order temporal properties inconsistency-aware gkabs, show addressed using known techniques, developed standard kabs.",4 "by-passing kohn-sham equations machine learning. last year, least 30,000 scientific papers used kohn-sham scheme density functional theory solve electronic structure problems wide variety scientific fields, ranging materials science biochemistry astrophysics. machine learning holds promise learning kinetic energy functional via examples, by-passing need solve kohn-sham equations. yield substantial savings computer time, allowing either larger systems longer time-scales tackled, attempts machine-learn functional limited need find derivative. present work overcomes difficulty directly learning density-potential energy-density maps test systems various molecules. improved accuracy lower computational cost method demonstrated reproducing dft energies range molecular geometries generated molecular dynamics simulations.
moreover, methodology could applied directly quantum chemical calculations, allowing construction density functionals quantum-chemical accuracy.",15 "reinforcement learning using quantum boltzmann machines. investigate whether quantum annealers select chip layouts outperform classical computers reinforcement learning tasks. associate transverse field ising spin hamiltonian layout qubits similar deep boltzmann machine (dbm) use simulated quantum annealing (sqa) numerically simulate quantum sampling system. design reinforcement learning algorithm set visible nodes representing states actions optimal policy first last layers deep network. absence transverse field, simulations show dbms train effectively restricted boltzmann machines (rbm) number weights. since sampling boltzmann distributions dbm classically feasible, evidence advantage non-turing sampling oracle. develop framework training network quantum boltzmann machine (qbm) presence significant transverse field reinforcement learning. improves reinforcement learning method using dbms.",18 "associative content-addressable networks exponentially many robust stable states. brain must robustly store large number memories, corresponding many events encountered lifetime. however, number memory states existing neural network models either grows weakly network size recall fails catastrophically vanishingly little noise. construct associative content-addressable memory exponentially many stable states robust error-correction. network possesses expander graph connectivity restricted boltzmann machine architecture. expansion property allows simple neural network dynamics perform par modern error-correcting codes. appropriate networks constructed sparse random connections, glomerular nodes, associative learning using low dynamic-range weights. 
thus, sparse quasi-random structures---characteristic important error-correcting codes---may provide high-performance computation artificial neural networks brain.",16 "scalability neural control musculoskeletal robots. anthropomimetic robots robots sense, behave, interact feel like humans. definition, anthropomimetic robots require human-like physical hardware actuation, also brain-like control sensing. self-evident realization meet requirements would human-like musculoskeletal robot brain-like neural controller. musculoskeletal robotic hardware neural control software existed decades, scalable approach could used build control anthropomimetic human-scale robot demonstrated yet. combining myorobotics, framework musculoskeletal robot development, spinnaker, neuromorphic computing platform, present proof-of-principle system scale dozens neurally-controlled, physically compliant joints. core, implements closed-loop cerebellar model provides real-time low-level neural control minimal power consumption maximal extensibility: higher-order (e.g., cortical) neural networks neuromorphic sensors like silicon-retinae -cochleae naturally incorporated.",4 "shading local shape. develop framework extracting concise representation shape information available diffuse shading small image patch. produces mid-level scene descriptor, comprised local shape distributions inferred separately every image patch across multiple scales. framework based quadratic representation local shape that, absence noise, guarantees recovering accurate local shape lighting. noise present, inferred local shape distributions provide useful shape information without over-committing particular image explanation. local shape distributions naturally encode fact smooth diffuse regions informative others, enable efficient robust reconstruction object-scale shape. 
experimental results show approach surface reconstruction compares well state-of-the-art synthetic images captured photographs.",4 "deep learning methods efficient large scale video labeling. present solution ""google cloud youtube-8m video understanding challenge"" ranked 5th place. proposed model ensemble three model families, two frame level one video level. training performed augmented dataset, cross validation.",19 "view adaptive recurrent neural networks high performance human action recognition skeleton data. skeleton-based human action recognition recently attracted increasing attention due popularity 3d skeleton data. one main challenge lies large view variations captured human actions. propose novel view adaptation scheme automatically regulate observation viewpoints occurrence action. rather re-positioning skeletons based human defined prior criterion, design view adaptive recurrent neural network (rnn) lstm architecture, enables network adapt suitable observation viewpoints end end. extensive experiment analyses show proposed view adaptive rnn model strives (1) transform skeletons various views much consistent viewpoints (2) maintain continuity action rather transforming every frame position body orientation. model achieves significant improvement state-of-the-art approaches three benchmark datasets.",4 "predictive-state decoders: encoding future recurrent networks. recurrent neural networks (rnns) vital modeling technique rely internal states learned indirectly optimization supervised, unsupervised, reinforcement training loss. rnns used model dynamic processes characterized underlying latent states whose form often unknown, precluding analytic representation inside rnn. predictive-state representation (psr) literature, latent state processes modeled internal state representation directly models distribution future observations, recent work area relied explicitly representing targeting sufficient statistics probability distribution.
seek combine advantages rnns psrs augmenting existing state-of-the-art recurrent neural networks predictive-state decoders (psds), add supervision network's internal state representation target predicting future observations. predictive-state decoders simple implement easily incorporated existing training pipelines via additional loss regularization. demonstrate effectiveness psds experimental results three different domains: probabilistic filtering, imitation learning, reinforcement learning. each, method improves statistical performance state-of-the-art recurrent baselines fewer iterations less data.",19 "deep bcd-net using identical encoding-decoding cnn structures iterative image recovery. ""extreme"" computational imaging collects extremely undersampled noisy measurements, obtaining accurate image within reasonable computing time challenging. incorporating image mapping convolutional neural networks (cnn) iterative image recovery great potential resolve issue. paper 1) incorporates image mapping cnn using identical convolutional kernels encoders decoders block coordinate descent (bcd) optimization method -- referred bcd-net using identical encoding-decoding cnn structures -- 2) applies alternating direction method multipliers train proposed bcd-net. numerical experiments show that, a) denoising moderately low signal-to-noise-ratio images b) extremely undersampled magnetic resonance imaging, proposed bcd-net achieves (significantly) accurate image recovery, compared bcd-net using distinct encoding-decoding structures and/or conventional image recovery model using wavelets total variation.",19 "accurate localization dense urban area using google street view image. accurate information location orientation camera mobile devices central utilization location-based services (lbs). mobile devices rely gps data data subject inaccuracy due imperfections quality signal provided satellites. shortcoming spurred research improving accuracy localization. 
since mobile devices camera, major thrust research seeks acquire local scene apply image retrieval techniques querying gps-tagged image database find best match acquired scene. techniques however computationally demanding unsuitable real-time applications assistive technology navigation blind visually impaired motivated work. overcome high complexity techniques, investigated use inertial sensors aid image-retrieval-based approach. armed information media images, data gps module along orientation sensors accelerometer gyro, sought limit size image set c search best match. specifically, data orientation sensors along dilution precision (dop) gps used find angle view estimation position. present analysis reduction image set size search well simulations demonstrate effectiveness fast implementation 98% estimated position error.",4 "unsupervised learning regression mixture models unknown number components. regression mixture models widely studied statistics, machine learning data analysis. fitting regression mixtures challenging usually performed maximum likelihood using expectation-maximization (em) algorithm. however, well-known initialization crucial em. initialization inappropriately performed, em algorithm may lead unsatisfactory results. em algorithm also requires number clusters given priori; problem selecting number mixture components requires using model selection criteria choose one set pre-estimated candidate models. propose new fully unsupervised algorithm learn regression mixture models unknown number components. developed unsupervised learning approach consists penalized maximum likelihood estimation carried robust expectation-maximization (em) algorithm fitting polynomial, spline b-spline regressions mixtures.
proposed learning approach fully unsupervised: 1) simultaneously infers model parameters optimal number regression mixture components data learning proceeds, rather two-fold scheme standard model-based clustering using afterward model selection criteria, 2) require accurate initialization unlike standard em regression mixtures. developed approach applied curve clustering problems. numerical experiments simulated data show proposed robust em algorithm performs well provides accurate results terms robustness regard initialization retrieving optimal partition actual number clusters. application real data framework functional data clustering, confirms benefit proposed approach practical applications.",19 "conversion artificial recurrent neural networks spiking neural networks low-power neuromorphic hardware. recent years field neuromorphic low-power systems consume orders magnitude less power gained significant momentum. however, wider use still hindered lack algorithms harness strengths architectures. neuromorphic adaptations representation learning algorithms emerging, efficient processing temporal sequences variable length-inputs remain difficult. recurrent neural networks (rnn) widely used machine learning solve variety sequence learning tasks. work present train-and-constrain methodology enables mapping machine learned (elman) rnns substrate spiking neurons, compatible capabilities current near-future neuromorphic systems. ""train-and-constrain"" method consists first training rnns using backpropagation time, discretizing weights finally converting spiking rnns matching responses artificial neurons spiking neurons. demonstrate approach mapping natural language processing task (question classification), demonstrate entire mapping process recurrent layer network ibm's neurosynaptic system ""truenorth"", spike-based digital neuromorphic hardware architecture. truenorth imposes specific constraints connectivity, neural synaptic parameters. 
satisfy constraints, necessary discretize synaptic weights neural activities 16 levels, limit fan-in 64 inputs. find short synaptic delays sufficient implement dynamical (temporal) aspect rnn question classification task. hardware-constrained model achieved 74% accuracy question classification using less 0.025% cores one truenorth chip, resulting estimated power consumption ~17 uw.",4 "network structure dynamics, emergence robustness stabilizing selection artificial genome. genetic regulation key component development, clear understanding structure dynamics genetic networks yet hand. work investigate properties within artificial genome model originally introduced reil. analyze statistical properties randomly generated genomes sequence- network level, show model correctly predicts frequency genes genomes found experimental data. using evolutionary algorithm based stabilizing selection phenotype, show robustness single base mutations, well random changes initial network states mimic stochastic fluctuations environmental conditions, emerge parallel. evolved genomes exhibit characteristic patterns sequence network level.",16 "automatic network reconstruction using asp. building biological models inferring functional dependencies experimental data important issue molecular biology. relieve biologist traditionally manual process, various approaches proposed increase degree automation. however, available approaches often yield single model only, rely specific assumptions, and/or use dedicated, heuristic algorithms intolerant changing circumstances requirements view rapid progress made biotechnology. aim provide declarative solution problem appeal answer set programming (asp) overcoming difficulties. build upon existing approach automatic network reconstruction proposed part authors. approach firm mathematical foundations well suited asp due combinatorial flavor providing characterization models explaining set experiments.
usage asp several benefits existing heuristic algorithms. first, declarative thus transparent biological experts. second, elaboration tolerant thus allows easy exploration incorporation biological constraints. third, allows exploring entire space possible models. finally, approach offers excellent performance, matching existing, special-purpose systems.",4 "robustly learning gaussian: getting optimal error, efficiently. study fundamental problem learning parameters high-dimensional gaussian presence noise -- $\varepsilon$-fraction samples chosen adversary. give robust estimators achieve estimation error $o(\varepsilon)$ total variation distance, optimal universal constant independent dimension. case mean unknown, robustness guarantee optimal factor $\sqrt{2}$ running time polynomial $d$ $1/\varepsilon$. mean covariance unknown, running time polynomial $d$ quasipolynomial $1/\varepsilon$. moreover algorithms require polynomial number samples. work shows sorts error guarantees established fifty years ago one-dimensional setting also achieved efficient algorithms high-dimensional settings.",4 "verifiability argumentation semantics. dung's abstract argumentation theory widely used formalism model conflicting information draw conclusions situations. hereby, knowledge represented so-called argumentation frameworks (afs) reasoning done via semantics extracting acceptable sets. reasonable semantics based notion conflict-freeness means arguments jointly acceptable linked within af. paper, study question information top conflict-free sets needed compute extensions semantics hand. introduce hierarchy so-called verification classes specifying required amount information. show well-known standard semantics exactly verifiable certain class.
framework also gives means study semantics lying in between known semantics, thus contributing abstract understanding different features argumentation semantics offer.",4 "fastderain: novel video rain streak removal method using directional gradient priors. rain streak removal important issue outdoor vision systems recently investigated extensively. paper, propose novel video rain streak removal approach fastderain, fully considers discriminative characteristics rain streaks clean video gradient domain. specifically, one hand, rain streaks sparse smooth along direction raindrops, whereas hand, clean videos exhibit piecewise smoothness along rain-perpendicular direction continuity along temporal direction. these smoothness continuity results sparse distribution different directional gradient domain, respectively. thus, minimize 1) $\ell_1$ norm enhance sparsity underlying rain streaks, 2) two $\ell_1$ norm unidirectional total variation (tv) regularizers guarantee anisotropic spatial smoothness, 3) $\ell_1$ norm time-directional difference operator characterize temporal continuity. split augmented lagrangian shrinkage algorithm (salsa) based algorithm designed solve proposed minimization model. experiments conducted synthetic real data demonstrate effectiveness efficiency proposed method. according comprehensive quantitative performance measures, approach outperforms state-of-the-art methods especially account running time.",4 "consistent kernel mean estimation functions random variables. provide theoretical foundation non-parametric estimation functions random variables using kernel mean embeddings. show continuous function $f$, consistent estimators mean embedding random variable $x$ lead consistent estimators mean embedding $f(x)$. mat\'ern kernels sufficiently smooth functions also provide rates convergence. results extend functions multiple random variables.
variables dependent, require estimator mean embedding joint distribution starting point; independent, sufficient separate estimators mean embeddings marginal distributions. either case, results cover mean embeddings based i.i.d. samples well ""reduced set"" expansions terms dependent expansion points. latter serves justification using expansions limit memory resources applying approach basis probabilistic programming.",19 "heinrich behmann's contributions second-order quantifier elimination view computational logic. relational monadic formulas (the l\""owenheim class) second-order quantifier elimination, closely related computation uniform interpolants, projection forgetting - operations currently receive much attention knowledge processing - always succeeds. decidability proof class heinrich behmann 1922 explicitly proceeds elimination equivalence preserving formula rewriting. reconstruct results behmann's publication detail discuss related issues relevant context modern approaches second-order quantifier elimination computational logic. addition, extensive documentation letters manuscripts behmann's bequest concern second-order quantifier elimination given, including commented register english abstracts german sources focus technical material. late 1920s behmann attempted develop elimination-based decision method formulas predicates whose arity larger one. manuscripts correspondence wilhelm ackermann show technical aspects still interest today give insight genesis ackermann's landmark paper ""untersuchungen \""uber das eliminationsproblem der mathematischen logik"" 1935, laid foundation two prevailing modern approaches second-order quantifier elimination.",4 "intrusion detection smartphones. smartphone technology becoming predominant communication tool people across world. people use smartphones keep contact data, browse internet, exchange messages, keep notes, carry personal files documents, etc. 
users browsing also capable shopping online, thus provoking need type credit card numbers security codes. smartphones becoming widespread security threats vulnerabilities facing technology. recent news articles indicate huge increase malware viruses operating systems employed smartphones (primarily android ios). major limitations smartphone technology processing power scarce energy source since smartphones rely battery usage. since smartphones devices change network location user moves different places, intrusion detection systems smartphone technology often classified idss designed mobile ad-hoc networks. aim research give brief overview ids technology, give overview major machine learning pattern recognition algorithms used ids technologies, give overview security models ios android propose new host-based ids model smartphones create proof-of-concept application android platform newly proposed model. keywords: ids, svm, android, ios;",4 "probabilistic interpretation linear solvers. manuscript proposes probabilistic framework algorithms iteratively solve unconstrained linear problems $bx = b$ positive definite $b$ $x$. goal replace point estimates returned existing methods gaussian posterior belief elements inverse $b$, used estimate errors. recent probabilistic interpretations secant family quasi-newton optimization algorithms extended. combined properties conjugate gradient algorithm, leads uncertainty-calibrated methods limited cost overhead conjugate gradients, self-contained novel interpretation quasi-newton conjugate gradient algorithms, foundation new nonlinear optimization methods.",12 "computational estimate visualisation evaluation agent classified rules learning system. student modelling agent classified rules learning applied development intelligent preassessment system presented [10],[11]. 
paper, demystify theory behind development pre-assessment system followed computational experimentation graph visualisation agent classified rules learning algorithm estimation prediction classified rules. addition, present preliminary results pre-assessment system evaluation. results, gathered system performed according design specification.",4 "general methodology determination 2d bodies elastic deformation invariants. application automatic identification parasites. novel methodology introduced exploits 2d images arbitrary elastic body deformation instances, quantify mechano-elastic characteristics deformation invariant. determination characteristics allows developing methods offering image undeformed body. general assumptions mechano-elastic properties bodies stated, lead two different approaches obtaining bodies' deformation invariants. one developed spot deformed body's neutral line cross sections, solves deformation pdes performing set equivalent image operations deformed body images. processes may furnish body undeformed version deformed image. confirmed obtaining undeformed shape deformed parasites, cells (protozoa), fibers human lips. addition, method applied important problem parasite automatic classification microscopic images. achieve this, first apply previous method straighten highly deformed parasites apply dedicated curve classification method straightened parasite contours. demonstrated essentially different deformations parasite give rise practically undeformed shape, thus confirming consistency introduced methodology. finally, developed pattern recognition method classifies unwrapped parasites 6 families, accuracy rate 97.6 %.",4 disentangled representations manipulation sentiment text. ability change arbitrary aspects text leaving core message intact could strong impact fields like marketing politics enabling e.g. automatic optimization message impact personalized language adapted receiver's profile. 
paper take first step towards system presenting algorithm manipulate sentiment text preserving semantics using disentangled representations. validation performed examining trajectories embedding space analyzing transformed sentences semantic preservation expression desired sentiment shift.,4 "in-bed pose estimation: deep learning shallow dataset. although human pose estimation various computer vision (cv) applications studied extensively last decades, yet in-bed pose estimation using camera-based vision methods ignored cv community assumed identical general purpose pose estimation methods. however, in-bed pose estimation specialized aspects comes specific challenges including notable differences lighting conditions throughout day also different pose distribution common human surveillance viewpoint. paper, demonstrate challenges significantly lessen effectiveness existing general purpose pose estimation models. order address lighting variation challenge, infrared selective (irs) image acquisition technique proposed provide uniform quality data various lighting conditions. deep learning framework proves effective model human pose estimation, however lack large public dataset in-bed poses prevents us using large network scratch. work, explored idea employing pre-trained convolutional neural network (cnn) model trained large public datasets general human poses fine-tuning model using shallow (limited size different perspective color) in-bed irs dataset. developed irs imaging system collected irs image data several realistic life-size mannequins simulated hospital room environment. pre-trained cnn called convolutional pose machine (cpm) repurposed in-bed pose estimation fine-tuning specific intermediate layers. using hog rectification method, pose estimation performance cpm significantly improved 26.4% pck0.1 criteria compared model without rectification.",4 "label efficient learning exploiting multi-class output codes. 
present new perspective popular multi-class algorithmic techniques one-vs-all error correcting output codes. rather studying behavior techniques supervised learning, establish connection success methods existence label-efficient learning procedures. show realizable agnostic cases, output codes successful learning labeled data, implicitly assume structure classes related. making structure explicit, design learning algorithms recover classes low label complexity. provide results commonly studied cases one-vs-all learning codewords classes well separated. additionally consider challenging case codewords well separated, satisfy boundary features condition captures natural intuition every bit codewords significant.",4 "optimal learning sequential decision making expensive cost functions stochastic binary feedbacks. consider problem sequentially making decisions rewarded ""successes"" ""failures"" predicted unknown relationship depends partially controllable vector attributes instance. learner takes active role selecting samples instance pool. goal maximize probability success either offline (training) online (testing) phases. problem motivated real-world applications observations time-consuming and/or expensive. develop knowledge gradient policy using online bayesian linear classifier guide experiment maximizing expected value information labeling alternative. provide finite-time analysis estimated error show maximum likelihood estimator based produced kg policy consistent asymptotically normal. also show knowledge gradient policy asymptotically optimal offline setting. work extends knowledge gradient setting contextual bandits. report results series experiments demonstrate efficiency.",19 "distant ie bootstrapping using lists document structure. distant labeling information extraction (ie) suffers noisy training data. describe way reducing noise associated distant ie identifying coupling constraints potential instance labels. 
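The multi-class output codes entry above studies one-vs-all and error-correcting output codes, where each class is assigned a binary codeword and prediction reduces to nearest-codeword decoding. A minimal sketch of Hamming-distance decoding over hypothetical codewords (not the paper's construction):

```python
import numpy as np

# hypothetical 4-class code: each row is a class codeword over 6 binary predictors
codewords = np.array([
    [0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
])

def decode(bits, codewords):
    """Assign the class whose codeword is nearest in Hamming distance."""
    dists = (codewords != bits).sum(axis=1)
    return int(dists.argmin())

# a predicted bit vector with one flipped bit still decodes to class 2
noisy = np.array([0, 1, 1, 0, 1, 1])
label = decode(noisy, codewords)
```

Well-separated codewords (large pairwise Hamming distance) are exactly what lets decoding absorb individual predictor errors, matching the "codewords well separated" case discussed in the entry.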
one example coupling, items list likely label. second example coupling comes analysis document structure: corpora, sections identified items section likely label. sections exist corpora, show augmenting large corpus coupling constraints even small, well-structured corpus improve performance substantially, doubling f1 one task.",4 "coherent online video style transfer. training feed-forward network fast neural style transfer images proven successful. however, naive extension process video frame frame prone producing flickering results. propose first end-to-end network online video style transfer, generates temporally coherent stylized video sequences near real-time. two key ideas include efficient network incorporating short-term coherence, propagating short-term coherence long-term, ensures consistency larger period time. network incorporate different image stylization networks. show proposed method clearly outperforms per-frame baseline qualitatively quantitatively. moreover, achieve visually comparable coherence optimization-based video style transfer, three orders magnitudes faster runtime.",4 "uavs using bayesian optimization locate wifi devices. address problem localizing non-collaborative wifi devices large region. main motive localize humans localizing wifi devices, e.g. search-and-rescue operations natural disaster. use active sensing approach relies unmanned aerial vehicles (uavs) collect signal-strength measurements informative locations. problem challenging since measurement received arbitrary times received uav close proximity device. reasons, extremely important make prudent decision measurements. use bayesian optimization approach based gaussian process (gp) regression. approach works well application since gps give reliable predictions measurements bayesian optimization makes judicious trade-off exploration exploitation. 
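The UAV localization entry above pairs Gaussian process regression with Bayesian optimization to trade off exploration and exploitation when choosing measurement locations. A minimal 1-d sketch with an RBF kernel and an upper-confidence-bound acquisition (kernel, length scale, and data are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def rbf(a, b, ls=0.5):
    """Squared-exponential kernel matrix between 1-d point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(X, y, Xq, noise=1e-6, ls=0.5):
    """GP posterior mean and std at query points Xq given data (X, y)."""
    K = rbf(X, X, ls) + noise * np.eye(len(X))
    Ks = rbf(Xq, X, ls)
    Kss = rbf(Xq, Xq, ls)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0, None))

def ucb_next(X, y, candidates, kappa=2.0):
    """Pick the candidate maximizing mean + kappa * std (explore/exploit)."""
    mean, std = gp_posterior(X, y, candidates)
    return candidates[np.argmax(mean + kappa * std)]

X = np.array([0.0, 1.0, 2.0])
y = np.array([0.1, 0.9, 0.2])   # toy signal-strength readings
grid = np.linspace(0.0, 2.0, 41)
x_next = ucb_next(X, y, grid)
```

The `kappa * std` bonus is what drives the UAV toward informative, uncertain locations rather than only re-measuring near the current best guess.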
field experiments conducted region 1000 $\times$ 1000 $m^2$, show approach reduces search area less 100 meters around wifi device within 5 minutes only. overall, approach localizes device less 15 minutes error less 20 meters.",4 "robust text detection natural scene images. text detection natural scene images important prerequisite many content-based image analysis tasks. paper, propose accurate robust method detecting texts natural scene images. fast effective pruning algorithm designed extract maximally stable extremal regions (msers) character candidates using strategy minimizing regularized variations. character candidates grouped text candidates single-link clustering algorithm, distance weights threshold clustering algorithm learned automatically novel self-training distance metric learning algorithm. posterior probabilities text candidates corresponding non-text estimated character classifier; text candidates high probabilities eliminated finally texts identified text classifier. proposed system evaluated icdar 2011 robust reading competition dataset; f measure 76% significantly better state-of-the-art performance 71%. experimental results publicly available multilingual dataset also show proposed method outperform competitive method f measure increase 9 percent. finally, setup online demo proposed scene text detection system http://kems.ustb.edu.cn/learning/yin/dtext.",4 "orthogonal idempotent transformations learning deep neural networks. identity transformations, used skip-connections residual networks, directly connect convolutional layers close input close output deep neural networks, improving information flow thus easing training. paper, introduce two alternative linear transforms, orthogonal transformation idempotent transformation.
according definition property orthogonal idempotent matrices, product multiple orthogonal (same idempotent) matrices, used form linear transformations, equal single orthogonal (idempotent) matrix, resulting information flow improved training eased. one interesting point success essentially stems feature reuse gradient reuse forward backward propagation maintaining information flow eliminating gradient vanishing problem express way skip-connections. empirically demonstrate effectiveness proposed two transformations: similar performance single-branch networks even superior multi-branch networks comparison identity transformations.",4 "pixelnn: example-based image synthesis. present simple nearest-neighbor (nn) approach synthesizes high-frequency photorealistic images ""incomplete"" signal low-resolution image, surface normal map, edges. current state-of-the-art deep generative models designed conditional image synthesis lack two important things: (1) unable generate large set diverse outputs, due mode collapse problem. (2) interpretable, making difficult control synthesized output. demonstrate nn approaches potentially address limitations, suffer accuracy small datasets. design simple pipeline combines best worlds: first stage uses convolutional neural network (cnn) maps input (overly-smoothed) image, second stage uses pixel-wise nearest neighbor method map smoothed output multiple high-quality, high-frequency outputs controllable manner. demonstrate approach various input modalities, various domains ranging human faces cats-and-dogs shoes handbags.",4 "revisiting problem mobile robot map building: hierarchical bayesian approach. present application hierarchical bayesian estimation robot map building. revisiting problem occurs robot decide whether seeing previously-built portion map, exploring new territory. difficult decision problem, requiring probability outside current known map. 
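The orthogonal/idempotent transformations entry above rests on two matrix facts: a product of orthogonal matrices is orthogonal, and powers of an idempotent matrix equal the matrix itself. A quick numerical check of both properties (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# random orthogonal matrices via QR decomposition
Q1, _ = np.linalg.qr(rng.normal(size=(4, 4)))
Q2, _ = np.linalg.qr(rng.normal(size=(4, 4)))
P_orth = Q1 @ Q2                              # product of orthogonal matrices

# an idempotent matrix: orthogonal projection onto the column space of A
A = rng.normal(size=(4, 2))
P_idem = A @ np.linalg.solve(A.T @ A, A.T)    # satisfies P @ P == P

orth_ok = np.allclose(P_orth.T @ P_orth, np.eye(4))
idem_ok = np.allclose(P_idem @ P_idem, P_idem)
```

These closure properties are what keep the composed linear transformation well-behaved across many stacked layers, so information flow is not attenuated the way arbitrary matrix products would attenuate it.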
estimate probability, model structure ""typical"" environment hidden markov model generates sequences views observed robot navigating environment. dirichlet prior structural models learned previously explored environments. whenever robot explores new environment, posterior model estimated dirichlet hyperparameters. approach implemented tested context multi-robot map merging, particularly difficult instance revisiting problem. experiments robot data show technique yields strong improvements alternative methods.",4 "reducing drift visual odometry inferring sun direction using bayesian convolutional neural network. present method incorporate global orientation information sun visual odometry pipeline using existing image stream, sun typically visible. leverage recent advances bayesian convolutional neural networks train implement sun detection model infers three-dimensional sun direction vector single rgb image. crucially, method also computes principled uncertainty associated prediction, using monte carlo dropout scheme. incorporate uncertainty sliding window stereo visual odometry pipeline accurate uncertainty estimates critical optimal data fusion. bayesian sun detection model achieves median error approximately 12 degrees kitti odometry benchmark training set, yields improvements 42% translational armse 32% rotational armse compared standard vo. open source implementation bayesian cnn sun estimator (sun-bcnn) using caffe available https://github.com/utiasstars/sun-bcnn-vo",4 "punny captions: witty wordplay image descriptions. wit quintessential form rich inter-human interaction, often grounded specific situation (e.g., comment response event). work, attempt build computational models produce witty descriptions given image. inspired cognitive account humor appreciation, employ linguistic wordplay, specifically puns. compare approach meaningful baseline approaches via human studies.
turing test style evaluation, people find model's description image wittier human's witty description 55% time!",4 "convsrc: smartphone based periocular recognition using deep convolutional neural network sparsity augmented collaborative representation. smartphone based periocular recognition gained significant attention biometric research community limitations biometric modalities like face, iris etc. existing methods periocular recognition employ hand-crafted features. recently, learning based image representation techniques like deep convolutional neural network (cnn) shown outstanding performance many visual recognition tasks. cnn needs huge volume data learning, periocular recognition limited amount data available. solution use cnn pre-trained dataset related domain, case challenge extract efficiently discriminative features. using pre-trained cnn model (vgg-net), propose simple, efficient compact image representation technique takes account wealth information sparsity existing activations convolutional layers employs principal component analysis. recognition, use efficient robust sparse augmented collaborative representation based classification (sa-crc) technique. thorough evaluation convsrc (the proposed system), experiments carried visob challenging database presented periocular recognition competition icip2016. obtained results show superiority convsrc state-of-the-art methods; obtains gmr 99% fmr = 10^-3 outperforms first winner icip2016 challenge 10%.",4 "efficient circle detection scheme digital images using ant system algorithm. detection geometric features digital images important exercise image analysis computer vision. hough transform techniques detection circles require huge memory space data processing hence requiring lot time computing locations data space, writing searching memory space. paper propose novel efficient scheme detecting circles edge-detected grayscale digital images. use ant-system algorithm purpose yet found much application field.
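The circle-detection entry above contrasts its ant-system scheme with Hough-transform voting, whose memory cost it criticizes. A minimal fixed-radius Hough accumulator shows the voting idea the entry refers to (coarse grid and tolerance are illustrative choices):

```python
import numpy as np

def hough_circle_centers(points, radius, grid=64):
    """Vote for circle centers at a known radius over a coarse grid.

    Each edge point votes for every grid cell lying at distance
    `radius` from it; the accumulator peak is the best-supported center.
    """
    acc = np.zeros((grid, grid), dtype=int)
    ys, xs = np.mgrid[0:grid, 0:grid]
    for px, py in points:
        d = np.hypot(xs - px, ys - py)
        acc[np.abs(d - radius) < 0.7] += 1    # vote on an annulus of cells
    cy, cx = np.unravel_index(acc.argmax(), acc.shape)
    return int(cx), int(cy)

# synthetic edge points on a circle of radius 10 centered at (30, 20)
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
pts = np.stack([30 + 10 * np.cos(t), 20 + 10 * np.sin(t)], axis=1)
center = hough_circle_centers(pts, radius=10)
```

The 2-d accumulator here already assumes a known radius; searching over radii too adds a third accumulator dimension, which is the memory blow-up motivating alternatives such as the ant system.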
main feature scheme detect intersecting well non-intersecting circles time efficiency makes useful real time applications. build ant system new type finds closed loops image tests circles.",4 "survey techniques improving generalization ability genetic programming solutions. field empirical modeling using genetic programming (gp), important evolve solution good generalization ability. generalization ability gp solutions get affected two important issues: bloat over-fitting. surveyed classified existing literature related different techniques used gp research community deal issues. also point limitation techniques, any. moreover, classification different bloat control approaches measures bloat over-fitting also discussed. believe work useful gp practitioners following ways: (i) better understand concepts generalization gp (ii) comparing existing bloat over-fitting control techniques (iii) selecting appropriate approach improve generalization ability gp evolved solutions.",4 approximate kalman filter q-learning continuous state-space mdps. seek learn effective policy markov decision process (mdp) continuous states via q-learning. given set basis functions state action pairs search corresponding set linear weights minimizes mean bellman residual. algorithm uses kalman filter model estimate weights developed simpler approximate kalman filter model outperforms current state art projected td-learning methods several standard benchmark problems.,4 "power asymmetry binary hashing. approximating binary similarity using hamming distance short binary hashes, show even similarity symmetric, shorter accurate hashes using two distinct code maps. i.e. approximating similarity $x$ $x'$ hamming distance $f(x)$ $g(x')$, two distinct binary codes $f,g$, rather hamming distance $f(x)$ $f(x')$.",4 "genealogical distance diversity estimate evolutionary algorithms. 
evolutionary edit distance two individuals population, i.e., amount applications genetic operator would take evolutionary process generate one individual starting other, seems like promising estimate diversity said individuals. introduce genealogical diversity, i.e., estimating two individuals' degree relatedness analyzing large, unused parts genome, computationally efficient method approximate measure diversity.",4 "turnover prediction shares using data mining techniques : case study. predicting turnover company ever fluctuating stock market always proved precarious situation certainly difficult task hand. data mining well-known sphere computer science aims extracting meaningful information large databases. however, despite existence many algorithms purpose predicting future trends, efficiency questionable predictions suffer high error rate. objective paper investigate various classification algorithms predict turnover different companies based stock price. authorized dataset predicting turnover taken www.bsc.com included stock market values various companies past 10 years. algorithms investigated using ""r"" tool. feature selection algorithm, boruta, run dataset extract important influential features classification. extracted features, total turnover company predicted using various classification algorithms like random forest, decision tree, svm multinomial regression. prediction mechanism implemented predict turnover company everyday basis hence could help navigate dubious stock market trades. accuracy rate 95% achieved prediction process. moreover, importance stock market attributes established well.",4 "generative adversarial networks using adaptive convolution. existing gans architectures generate images use transposed convolution resize-convolution upsampling algorithm lower higher resolution feature maps generator. argue kind fixed operation problematic gans model objects different visual appearances. 
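The genealogical-distance entry above treats an evolutionary edit distance between individuals as a diversity estimate, then approximates it via genome analysis. As a stand-in, the standard dynamic-programming edit distance illustrates the underlying quantity (this is plain Levenshtein distance, not the paper's genealogical approximation):

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions/deletions/substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# diversity proxy between two genome strings
d = edit_distance("ACGTACGT", "ACGAACGA")
```

Computing this exactly over operator applications is expensive in an evolutionary setting, which is the motivation the entry gives for its cheaper genome-based approximation.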
propose novel adaptive convolution method learns upsampling algorithm based local context location address problem. modify baseline gans architecture replacing normal convolutions adaptive convolutions generator. experiments cifar-10 dataset show modified models improve baseline model large margin. furthermore, models achieve state-of-the-art performance cifar-10 stl-10 datasets unsupervised setting.",4 "nonextensive information theoretical machine. paper, propose new discriminative model named \emph{nonextensive information theoretical machine (nitm)} based nonextensive generalization shannon information theory. nitm, weight parameters treated random variables. tsallis divergence used regularize distribution weight parameters maximum unnormalized tsallis entropy distribution used evaluate fitting effect. one hand, showed well-known margin-based loss functions $\ell_{0/1}$ loss, hinge loss, squared hinge loss exponential loss unified unnormalized tsallis entropy. hand, gaussian prior regularization generalized student-t prior regularization similar computational complexity. model solved efficiently gradient-based convex optimization performance illustrated standard datasets.",4 "efficient watermarking algorithm improve payload robustness without affecting image perceptual quality. capacity, robustness, & perceptual quality watermark data important issues considered. lot research going increase parameters watermarking digital images, always tradeoff among them. . paper efficient watermarking algorithm improve payload robustness without affecting perceptual quality image data based dwt discussed. aim paper employ nested watermarks wavelet domain increases capacity ultimately robustness attacks selection different scaling factor values & hh bands embedding create visible artifacts original image therefore original watermarked image similar.",4 "shape texture using locally scaled point processes. 
shape texture refers extraction 3d information 2d images irregular texture. paper introduces statistical framework learn shape texture convex texture elements 2d image represented point process. first step, 2d image preprocessed generate probability map corresponding estimate unnormalized intensity latent point process underlying texture elements. latent point process subsequently inferred probability map non-parametric, model free manner. finally, 3d information extracted point pattern applying locally scaled point process model local scaling function represents deformation caused projection 3d surface onto 2d image.",19 "relativistic monte carlo. hamiltonian monte carlo (hmc) popular markov chain monte carlo (mcmc) algorithm generates proposals metropolis-hastings algorithm simulating dynamics hamiltonian system. however, hmc sensitive large time discretizations performs poorly mismatch spatial geometry target distribution scales momentum distribution. particular mass matrix hmc hard tune well. order alleviate problems propose relativistic hamiltonian monte carlo, version hmc based relativistic dynamics introduce maximum velocity particles. also derive stochastic gradient versions algorithm show resulting algorithms bear interesting relationships gradient clipping, rmsprop, adagrad adam, popular optimisation methods deep learning. based this, develop relativistic stochastic gradient descent taking zero-temperature limit relativistic stochastic gradient hamiltonian monte carlo. experiments show relativistic algorithms perform better classical newtonian variants adam.",19 "relative upper confidence bound k-armed dueling bandit problem. paper proposes new method k-armed dueling bandit problem, variation regular k-armed bandit problem offers relative feedback pairs arms. approach extends upper confidence bound algorithm relative setting using estimates pairwise probabilities select promising arm applying upper confidence bound winner benchmark. 
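The relative upper confidence bound entry above maintains optimistic estimates of pairwise win probabilities between arms. A sketch of that index from a matrix of duel outcomes (the win counts and champion rule below are hypothetical, simplified from the paper's full procedure):

```python
import numpy as np

def relative_ucb(wins, alpha=0.51):
    """Upper confidence bounds on pairwise win probabilities.

    wins[i, j] counts how often arm i beat arm j in a duel.
    """
    n = wins + wins.T                        # duels played per pair
    t = max(wins.sum(), 1)                   # total duels so far
    p_hat = np.where(n > 0, wins / np.maximum(n, 1), 0.5)
    bonus = np.sqrt(alpha * np.log(t) / np.maximum(n, 1))
    u = np.clip(p_hat + bonus, 0.0, 1.0)
    np.fill_diagonal(u, 0.5)
    return u

wins = np.array([[0, 8, 6],
                 [2, 0, 5],
                 [4, 5, 0]], dtype=float)
u = relative_ucb(wins)
# a plausible champion: an arm optimistic against every rival
champion = int(np.argmax((u >= 0.5).all(axis=1) * u.sum(axis=1)))
```

The exploration bonus shrinks as a pair accumulates duels, so under-compared pairs stay optimistic and keep getting sampled, which is the mechanism behind the logarithmic regret bound.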
prove finite-time regret bound order o(log t). addition, empirical results using real data information retrieval application show greatly outperforms state art.",4 "compressive optical deflectometric tomography: constrained total-variation minimization approach. optical deflectometric tomography (odt) provides accurate characterization transparent materials whose complex surfaces present real challenge manufacture control. odt, refractive index map (rim) transparent object reconstructed measuring light deflection multiple orientations. show imaging modality made ""compressive"", i.e., correct rim reconstruction achievable far less observations required traditional filtered back projection (fbp) methods. assuming cartoon-shape rim model, reconstruction driven minimizing map total-variation fidelity constraint available observations. moreover, two realistic assumptions added improve stability approach: map positivity frontier condition. numerically, method relies accurate odt sensing model primal-dual minimization scheme, including easily sensing operator proposed rim constraints. conclude paper demonstrating power method synthetic experimental data various compressive scenarios. particular, compressiveness stabilized odt problem demonstrated observing typical gain 20 db compared fbp 5% 360 incident light angles moderately noisy sensing.",4 "learning global features coreference resolution. compelling evidence coreference prediction would benefit modeling global information entity-clusters. yet, state-of-the-art performance achieved systems treating mention prediction independently, attribute inherent difficulty crafting informative cluster-level features. instead propose use recurrent neural networks (rnns) learn latent, global representations entity clusters directly mentions. 
show representations especially useful prediction pronominal mentions, incorporated end-to-end coreference system outperforms state art without requiring additional search.",4 "learning low dimensional convolutional neural networks high-resolution remote sensing image retrieval. learning powerful feature representations image retrieval always challenging task field remote sensing. traditional methods focus extracting low-level hand-crafted features time-consuming also tend achieve unsatisfactory performance due content complexity remote sensing images. paper, investigate extract deep feature representations based convolutional neural networks (cnn) high-resolution remote sensing image retrieval (hrrsir). end, two effective schemes proposed generate powerful feature representations hrrsir. first scheme, deep features extracted fully-connected convolutional layers pre-trained cnn models, respectively; second scheme, propose novel cnn architecture based conventional convolution layers three-layer perceptron. novel cnn model trained large remote sensing dataset learn low dimensional features. two schemes evaluated several public challenging datasets, results indicate proposed schemes particular novel cnn able achieve state-of-the-art performance.",4 "metric learning perspective svm: relation svm lmnn. support vector machines, svms, large margin nearest neighbor algorithm, lmnn, two popular learning algorithms quite different learning biases. paper bring unified view show much stronger relation commonly thought. analyze svms metric learning perspective cast metric learning problem, view helps us uncover relations two algorithms. show lmnn seen learning set local svm-like models quadratic space. along way inspired metric-based interpretation svm derive novel variant svms, epsilon-svm, lmnn even similar. give unified view lmnn different svm variants. 
finally provide preliminary experiments number benchmark datasets show epsilon-svm compares favorably respect lmnn svm.",4 "geometric primitive feature extraction - concepts, algorithms, applications. thesis presents important insights concepts related topic extraction geometric primitives edge contours digital images. three specific problems related topic studied, viz., polygonal approximation digital curves, tangent estimation digital curves, ellipse fitting and detection digital curves. problem polygonal approximation, two fundamental problems addressed. first, nature performance evaluation metrics relation local global fitting characteristics studied. second, explicit error bound error introduced digitizing continuous line segment derived used propose generic non-heuristic parameter independent framework used several dominant point detection methods. problem tangent estimation digital curves, simple method tangent estimation proposed. shown method definite upper bound error conic digital curves. shown method performs better almost (seventy two) existing tangent estimation methods conic well several non-conic digital curves. problem fitting ellipses digital curves, geometric distance minimization model considered. unconstrained, linear, non-iterative, numerically stable ellipse fitting method proposed shown proposed method better selectivity elliptic digital curves (high true positive low false positive) compared several ellipse fitting methods. problem detecting ellipses set digital curves, several innovative fast pre-processing, grouping, hypotheses evaluation concepts applicable digital curves proposed combined form ellipse detection method.",4 "accelerated block coordinate proximal gradients applications high dimensional statistics. nonconvex optimization problems arise different research fields arouse lots attention signal processing, statistics machine learning.
work, explore accelerated proximal gradient method variants shown converge nonconvex context recently. show novel variant proposed here, exploits adaptive momentum block coordinate update specific update rules, improves performance broad class nonconvex problems. applications sparse linear regression regularizations like lasso, grouped lasso, capped $\ell_1$ scap, proposed scheme enjoys provable local linear convergence, experimental justification.",12 "local procrustes manifold embedding: measure embedding quality embedding algorithms. present procrustes measure, novel measure based procrustes rotation enables quantitative comparison output manifold-based embedding algorithms (such lle (roweis saul, 2000) isomap (tenenbaum et al, 2000)). measure also serves natural tool choosing dimension-reduction parameters. also present two novel dimension-reduction techniques attempt minimize suggested measure, compare results techniques results existing algorithms. finally, suggest simple iterative method used improve output existing algorithms.",19 "general algorithm deciding transportability experimental results. generalizing empirical findings new environments, settings, populations essential scientific explorations. article treats particular problem generalizability, called ""transportability"", defined license transfer information learned experimental studies different population, observational studies conducted. given set assumptions concerning commonalities differences two populations, pearl bareinboim (2011) derived sufficient conditions permit transfer take place. article summarizes findings supplements effective procedure deciding transportability feasible. establishes necessary sufficient condition deciding causal effects target population estimable statistical information available causal information transferred experiments. 
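The accelerated block coordinate proximal-gradient entry above targets lasso-type problems, whose key ingredient is the soft-thresholding proximal operator. A plain (non-accelerated, non-block) ISTA sketch on a synthetic sparse regression, as a simplified stand-in for the paper's scheme:

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def ista(A, b, lam, n_iter=500):
    """Minimize 0.5 * ||Ax - b||^2 + lam * ||x||_1 by proximal gradient."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)         # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]             # sparse ground truth
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
```

Accelerated and block-coordinate variants keep exactly this prox step but add momentum and per-block updates, which is where the entry's local linear convergence claims apply.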
article provides complete algorithm computing transport formula, is, way combining observational experimental information synthesize bias-free estimate desired causal relation. finally, article examines differences transportability variants generalizability.",4 "convolutional neural networks joint object detection pose estimation: comparative study. paper study application convolutional neural networks jointly detecting objects depicted still images estimating 3d pose. identify different feature representations oriented objects, energies lead network learn representations. choice representation crucial since pose object natural, continuous structure category discrete variable. evaluate different approaches joint object detection pose estimation task pascal3d+ benchmark using average viewpoint precision. show classification approach discretized viewpoints achieves state-of-the-art performance joint object detection pose estimation, significantly outperforms existing baselines benchmark.",4 "maximum production transmission messages rate service discovery protocols. minimizing number dropped user datagram protocol (udp) messages network regarded challenge researchers. issue represents serious problems many protocols particularly depend sending messages part strategy, us service discovery protocols. paper proposes evaluates algorithm predict minimum period time required two consecutive messages suggests minimum queue sizes routers, manage traffic minimise number dropped messages caused either congestion queue overflow together. algorithm applied universal plug play (upnp) protocol using ns2 simulator. tested routers connected two configurations; centralized decentralized. message length bandwidth links among routers taken consideration. result shows better improvement number dropped messages among routers.",4 "innateness, alphazero, artificial intelligence. concept innateness rarely discussed context artificial intelligence.
discussed, hinted at, often context trying reduce amount innate machinery given system. paper, consider test case recent series papers silver et al (silver et al., 2017a) alphago successors presented argument ""even challenging domains: possible train superhuman level, without human examples guidance"", ""starting tabula rasa."" argue claims overstated, multiple reasons. close arguing artificial intelligence needs greater attention innateness, point proposals innateness might look like.",4 "generalization without systematicity: compositional skills sequence-to-sequence recurrent networks. humans understand produce new utterances effortlessly, thanks compositional skills. person learns meaning new verb ""dax,"" immediately understand meaning ""dax twice"" ""sing dax."" paper, introduce scan domain, consisting set simple compositional navigation commands paired corresponding action sequences. test zero-shot generalization capabilities variety recurrent neural networks (rnns) trained scan sequence-to-sequence methods. find rnns make successful zero-shot generalizations differences training test commands small, apply ""mix-and-match"" strategies solve task. however, generalization requires systematic compositional skills (as ""dax"" example above), rnns fail spectacularly. conclude proof-of-concept experiment neural machine translation, suggesting lack systematicity might partially responsible neural networks' notorious training data thirst.",4 "locality low-dimensions prediction natural experience fmri. functional magnetic resonance imaging (fmri) provides dynamical access complex functioning human brain, detailing hemodynamic activity thousands voxels hundreds sequential time points. one approach towards illuminating connection fmri cognitive function decoding; time series voxel activities combine provide information internal external experience? seek models fmri decoding balanced simplicity interpretation effectiveness prediction. 
we use signals from a subject immersed in virtual reality to compare global and local methods of prediction, applying both linear and nonlinear techniques of dimensionality reduction. we find that the prediction of complex stimuli is remarkably low-dimensional, saturating with less than 100 features. in particular, we build effective models based on the decorrelated components of cognitive activity in the classically-defined brodmann areas. for some of the stimuli, the top predictive areas were surprisingly transparent, including wernicke's area for verbal instructions, visual cortex for facial and body features, and visual-temporal regions for velocity. direct sensory experience resulted in the most robust predictions, with the highest correlation ($c \sim 0.8$) between the predicted and experienced time series for verbal instructions. techniques based on non-linear dimensionality reduction (laplacian eigenmaps) performed similarly. the interpretability and relative simplicity of our approach provides a conceptual basis upon which to build more sophisticated techniques for fmri decoding, and offers a window into cognitive function during dynamic, natural experience.",16 "convex optimization for big data. this article reviews recent advances in convex optimization algorithms for big data, which aim to reduce the computational, storage, and communications bottlenecks. we provide an overview of this emerging field, describe contemporary approximation techniques like first-order methods and randomization for scalability, and survey the important role of parallel and distributed computation. the new big data algorithms are based on surprisingly simple principles and attain staggering accelerations even for classical problems.",12 "data mining for the concept ""end of the world"" in twitter microblogs. this paper describes the analysis of quantitative characteristics of frequent sets and association rules in the posts of twitter microblogs related to the discussion of the ""end of the world"", allegedly predicted on december 21, 2012 due to the mayan calendar. it was discovered that frequent sets and association rules characterize the semantic relations between the concepts of the analyzed subjects. the support of frequent sets reaches its global maximum before the expected event, with some time delay. such frequent sets may be considered as predictive markers that characterize the significance of expected events for blogosphere users.
it was also shown that the time dynamics of the confidence in the revealed association rules can have predictive characteristics: exceeding a certain threshold may be a signal for the corresponding reaction of society in the time interval before the most probable coming of the event.",4 "improving the performance of an english-tamil statistical machine translation system using source-side pre-processing. machine translation is one of the major, oldest and most active research areas in natural language processing. currently, statistical machine translation (smt) dominates machine translation research. statistical machine translation is an approach to machine translation that uses models to learn translation patterns directly from data, and generalizes them to translate new unseen text. the smt approach is largely language independent, i.e. the models can be applied to any language pair. statistical machine translation (smt) attempts to generate translations using statistical methods based on bilingual text corpora. where such corpora are available, excellent results can be attained translating similar texts, but such corpora are still not available for many language pairs. statistical machine translation systems, in general, have difficulty in handling morphology on the source or the target side, especially for morphologically rich languages. errors in morphology or syntax in the target language can have severe consequences for the meaning of the sentence: they can change the grammatical function of words, or the understanding of the sentence can become incorrect through wrong tense information in the verb. a baseline smt system, also known as a phrase based statistical machine translation (pbsmt) system, does not use any linguistic information and operates only on the surface word form. recent research has shown that adding linguistic information helps to improve the accuracy of translation with a smaller amount of bilingual corpora. adding linguistic information can be done using a factored statistical machine translation system or through pre-processing steps. this paper investigates how english-side pre-processing can be used to improve the accuracy of an english-tamil smt system.",4 "semi-structured data extraction and modelling: the wia project. in the last decades, the amount of data of all kinds available electronically has increased dramatically.
these data are accessible through a range of interfaces including web browsers, database query languages, and application-specific interfaces, built on top of a number of different data exchange formats. the data span from un-structured to highly structured. often, some structure exists, even if the structure is implicit, and not as rigid or regular as that found in standard database systems. spreadsheet documents are prototypical in this respect. as spreadsheets are a lightweight technology able to supply companies with easy-to-build business management and business intelligence applications, business people largely adopt spreadsheets as smart vehicles for data file generation and sharing. actually, as spreadsheets grow in complexity (e.g., when used in product development plans or quoting), their arrangement, maintenance and analysis appear to be a knowledge-driven activity. an algorithmic approach to the problem of automatic data structure extraction from spreadsheet documents (i.e., grid-structured and free topological-related data) emerges in the wia project: worksheets intelligent analyser. the wia-algorithm shows how to provide a description of spreadsheet contents in terms of higher level abstractions or conceptualisations. in particular, the wia-algorithm targets the extraction of i) the calculus work-flow implemented by the spreadsheet formulas and ii) the logical role played by the data that take part in the calculus. the aim of the resulting conceptualisations is to provide spreadsheets with abstract representations useful for model refinements and optimizations by evolutionary algorithms computations.",4 "characterizing the maximum parameter of total-variation denoising through the pseudo-inverse of the divergence. we focus on the maximum regularization parameter for anisotropic total-variation denoising. it corresponds to the minimum value of the regularization parameter above which the solution remains constant. while this value is well known for the lasso, such a critical value has not been investigated in detail for total-variation. it is nonetheless of importance since, when tuning the regularization parameter, it allows fixing an upper-bound on the grid over which the optimal parameter is sought. we establish a closed form expression for the one-dimensional case, as well as an upper-bound for the two-dimensional case, which appears reasonably tight in practice.
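as a hedged illustration of the one-dimensional closed form mentioned in the total-variation entry above: in the standard dual formulation of 1-d anisotropic tv denoising, the solution collapses to the mean of the signal exactly when the regularization parameter reaches the largest absolute partial sum of the centred signal. this is a sketch under that assumption, not the entry's own code:

```python
def tv_lambda_max_1d(y):
    """smallest lambda for which 1-d anisotropic tv denoising
    min_x 0.5*||y - x||^2 + lam * sum |x_{i+1} - x_i|
    returns a constant signal (the mean of y): the maximum
    absolute partial sum of the centred signal (a standard
    dual-based closed form, assumed here for illustration)."""
    n = len(y)
    mean = sum(y) / n
    s, best = 0.0, 0.0
    for v in y[:-1]:  # partial sums over k = 1..n-1
        s += v - mean
        best = max(best, abs(s))
    return best
```

for example, the two-point signal [0, 1] yields 0.5: any smaller lambda leaves a jump in the solution, any larger one flattens it to the mean.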
this problem is directly linked to the computation of the pseudo-inverse of the divergence, which can be quickly obtained by performing convolutions in the fourier domain.",19 "incremental maintenance of association rules under support threshold change. maintenance of association rules is an interesting problem. several incremental maintenance algorithms have been proposed since the work of (cheung et al, 1996). the majority of these algorithms maintain rule bases assuming that the support threshold does not change. in this paper, we present an incremental maintenance algorithm under support threshold change. this solution allows the user to maintain a rule base under any support threshold.",4 "a cognitive mind-map framework to foster trust. the explorative mind-map is a dynamic framework that emerges automatically from the input it gets. it is unlike a verificative modeling system where existing (human) thoughts are placed and connected together. in this regard, explorative mind-maps change their size continuously and are adaptive with connectionist cells inside; mind-maps process data input incrementally and offer many possibilities to interact with the user through an appropriate communication interface. with respect to a cognitively motivated situation like a conversation between partners, mind-maps become interesting as they are able to process stimulating signals whenever they occur. if these signals are close to the mind-map's understanding of the world, then the conversational partner becomes automatically more trustful than if the signals match its knowledge scheme less well. in this (position) paper, we therefore motivate explorative mind-maps as a cognitive engine and propose them as a decision support engine to foster trust.",4 "context aware nonnegative matrix factorization clustering. in this article we propose a method to refine the clustering results obtained with the nonnegative matrix factorization (nmf) technique, imposing consistency constraints on the final labeling of the data. the research community has focused its effort on the initialization and on the optimization part of this method, without paying attention to the final cluster assignments. we propose a game theoretic framework in which each object to be clustered is represented as a player, which has to choose its cluster membership. the information obtained with nmf is used to initialize the strategy space of the players, and a weighted graph is used to model the interactions among the players.
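the support and confidence quantities that the association-rule entries above revolve around can be computed directly from a transaction list. this is a minimal sketch with hypothetical helper names, not code from either entry:

```python
def support(itemset, transactions):
    """fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """confidence of the rule antecedent -> consequent:
    support of the union divided by support of the antecedent."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))
```

incremental maintenance, as in the entry above, amounts to updating the hit counts behind these ratios as transactions arrive, rather than rescanning the whole database.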
these interactions allow the players to choose a cluster which is coherent with the clusters chosen by similar players, a property which is not guaranteed by nmf, since it produces a soft clustering of the data. the results on common benchmarks show that our model is able to improve the performance of many nmf formulations.",4 "adaptive admm with spectral penalty parameter selection. the alternating direction method of multipliers (admm) is a versatile tool for solving a wide range of constrained optimization problems, with differentiable or non-differentiable objective functions. unfortunately, its performance is highly sensitive to a penalty parameter, which makes admm often unreliable and hard to automate for a non-expert user. we tackle this weakness of admm by proposing a method to adaptively tune the penalty parameters to achieve fast convergence. the resulting adaptive admm (aadmm) algorithm, inspired by the successful barzilai-borwein spectral method for gradient descent, yields fast convergence and relative insensitivity to the initial stepsize and problem scaling.",4 "a-ward_p\b{eta}: effective hierarchical clustering using the minkowski metric and a fast k-means initialisation. in this paper we make two novel contributions to hierarchical clustering. first, we introduce an anomalous pattern initialisation method for hierarchical clustering algorithms, called a-ward, capable of substantially reducing the time they take to converge. this method generates an initial partition with a sufficiently large number of clusters. this allows the cluster merging process to start from this partition rather than from a trivial partition composed solely of singletons. our second contribution is an extension of the ward and ward p algorithms to the situation where the feature weight exponent can differ from the exponent of the minkowski distance. this new method, called a-ward p\b{eta}, is able to generate a much wider variety of clustering solutions. we also demonstrate that its parameters can be estimated reasonably well by using a cluster validity index. we perform numerous experiments using data sets with two types of noise: insertion of noise features and blurring of within-cluster values of some features.
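for intuition about the penalty-parameter adaptation discussed in the adaptive admm entry above: a simpler, widely used cousin of the spectral rule is residual balancing, which grows or shrinks the penalty depending on which admm residual dominates. this sketch shows the balancing rule only (not the entry's barzilai-borwein spectral rule), with hypothetical parameter names:

```python
def update_penalty(rho, primal_res, dual_res, mu=10.0, tau=2.0):
    """residual-balancing update for the admm penalty parameter:
    a large primal residual means the constraint is enforced too
    weakly (increase rho); a large dual residual means it is
    enforced too strongly (decrease rho); otherwise keep rho."""
    if primal_res > mu * dual_res:
        return rho * tau
    if dual_res > mu * primal_res:
        return rho / tau
    return rho
```

calling this once per iteration keeps the two residual norms within a factor of mu of each other, which is the practical goal the spectral rule pursues more aggressively.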
these experiments allow us to conclude: (i) our anomalous pattern initialisation method does indeed reduce the time a hierarchical clustering algorithm takes to complete, without negatively impacting its cluster recovery ability; (ii) a-ward p\b{eta} provides better cluster recovery than both ward and ward p.",4 "méthodes pour la représentation informatisée de données lexicales / methoden der speicherung lexikalischer daten. in recent years, new developments in the area of lexicography have not only altered the management, processing and publishing of lexicographical data, but have also created new types of products such as electronic dictionaries and thesauri. these expand the range of possible uses of lexical data and support users with more flexibility, for instance in assisting human translation. in this article, we give a short and easy-to-understand introduction to the problematic nature of the storage, display and interpretation of lexical data. we then describe the main methods and specifications used to build and represent lexical data. the paper is targeted at the following groups of people: linguists, lexicographers, specialists, computer linguists and all others who wish to learn more about the modelling, representation and visualization of lexical knowledge. the paper is written in two languages: french and german.",4 "lower bound analysis of population-based evolutionary algorithms on pseudo-boolean functions. evolutionary algorithms (eas) are population-based general-purpose optimization algorithms, and have been successfully applied to various real-world optimization tasks. however, previous theoretical studies often employ eas with only a parent or offspring population and focus on specific problems. furthermore, they often only show upper bounds on the running time, while lower bounds are also necessary to get a complete understanding of an algorithm. in this paper, we analyze the running time of the ($\mu$+$\lambda$)-ea (a general population-based ea with mutation only) on the class of pseudo-boolean functions with a unique global optimum. by applying the recently proposed switch analysis approach, we prove the lower bound $\Omega(n \ln n+ \mu + \lambda n\ln\ln n/ \ln n)$ for the first time.
particularly on the two widely-studied problems, onemax and leadingones, the derived lower bound discloses that the ($\mu$+$\lambda$)-ea will be strictly slower than the (1+1)-ea when the population size $\mu$ or $\lambda$ is above a moderate order. our results imply that an increase of the population size, while usually desired in practice, bears the risk of increasing the lower bound of the running time and thus should be carefully considered.",4 "time stretch inspired computational imaging. we show that dispersive propagation of light followed by phase detection has properties that can be exploited for extracting features from waveforms. this discovery is spearheading the development of a new class of physics-inspired algorithms for feature extraction from digital images, with unique properties and superior dynamic range compared to conventional algorithms. in certain cases, these algorithms have the potential to be an energy efficient and scalable substitute for synthetically fashioned computational techniques in practice today.",4 numerical weather prediction or stochastic modeling: an objective criterion of choice for global radiation forecasting. numerous methods exist and have been developed for global radiation forecasting. the two most popular types are numerical weather predictions (nwp) and predictions using stochastic approaches. we propose to compute a parameter, constructed in part from the mutual information quantity, which measures the mutual dependence between two variables. it is calculated with the objective to establish the more relevant method between nwp and stochastic models for the current problem.,19 "dirichlet fragmentation processes. tree structures are ubiquitous in data across many domains, and many datasets are naturally modelled by unobserved tree structures. in this paper, we first review the theory of random fragmentation processes [bertoin, 2006] and a number of existing methods for modelling trees, including the popular nested chinese restaurant process (ncrp). we then define a general class of probability distributions over trees: the dirichlet fragmentation process (dfp), a novel combination of the theory of dirichlet processes and random fragmentation processes. the dfp presents a stick-breaking construction, and relates to the ncrp in the same way the dirichlet process relates to the chinese restaurant process.
furthermore, we develop a novel hierarchical mixture model with the dfp, and empirically compare the new model to similar models in machine learning. experiments show the dfp mixture model is convincingly better than existing state-of-the-art approaches for hierarchical clustering and density modelling.",19 "gaussian processes for data-efficient learning in robotics and control. autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning allows reducing the amount of engineering knowledge that is otherwise required. however, autonomous reinforcement learning (rl) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. to address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. in this article, we follow a different approach and speed up learning by extracting more information from data. in particular, we learn a probabilistic, non-parametric gaussian process transition model of the system. by explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. compared to state-of-the-art rl, our model-based policy search method achieves an unprecedented speed of learning. we demonstrate its applicability to autonomous learning in real robot and control tasks.",19 "geometric decision tree. in this paper we present a new algorithm for learning oblique decision trees. most of the current decision tree algorithms rely on impurity measures to assess the goodness of hyperplanes at each node while learning a decision tree in a top-down fashion. these impurity measures do not properly capture the geometric structures in the data. motivated by this, our algorithm uses a strategy to assess the hyperplanes in such a way that the geometric structure in the data is taken into account. at each node of the decision tree, we find the clustering hyperplanes for both classes and use their angle bisectors as the split rule at that node. we show through empirical studies that this idea leads to small decision trees and better performance.
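the angle bisectors used as split rules in the geometric decision tree entry above have a simple closed form: normalise each hyperplane's coefficients and take their sum and difference. a minimal sketch, with hypothetical function and argument names:

```python
import math

def angle_bisectors(w1, b1, w2, b2):
    """return the two angle bisectors of the hyperplanes
    w1.x + b1 = 0 and w2.x + b2 = 0: after dividing each
    hyperplane by its coefficient norm, the sum and the
    difference of the normalised equations bisect the angles
    between the two hyperplanes."""
    n1 = math.sqrt(sum(c * c for c in w1))
    n2 = math.sqrt(sum(c * c for c in w2))
    plus = ([a / n1 + b / n2 for a, b in zip(w1, w2)], b1 / n1 + b2 / n2)
    minus = ([a / n1 - b / n2 for a, b in zip(w1, w2)], b1 / n1 - b2 / n2)
    return plus, minus
```

for the two coordinate axes in the plane, this recovers the diagonal lines x + y = 0 and x - y = 0, as expected.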
we also present some analysis to show that the angle bisectors of the clustering hyperplanes, which we use as split rules at each node, are solutions of an interesting optimization problem, and hence argue that this is a principled method of learning a decision tree.",4 "generalization error bounds with probabilistic guarantee for sgd in nonconvex optimization. the success of deep learning has led to a rising interest in the generalization property of the stochastic gradient descent (sgd) method, and stability is one popular approach to study it. existing works based on stability have studied nonconvex loss functions, but only considered the generalization error of sgd in expectation. in this paper, we establish various generalization error bounds with probabilistic guarantee for sgd. specifically, for both general nonconvex loss functions and gradient dominant loss functions, we characterize the on-average stability of the iterates generated by sgd in terms of the on-average variance of the stochastic gradients. such characterization leads to improved bounds on the generalization error for sgd. we then study the regularized risk minimization problem with strongly convex regularizers, and obtain improved generalization error bounds for proximal sgd. with strongly convex regularizers, we further establish generalization error bounds for nonconvex loss functions under proximal sgd with high-probability guarantee, i.e., exponential concentration in probability.",19 "improving vision-based self-positioning in intelligent transportation systems via integrated lane and vehicle detection. traffic congestion is a widespread problem. dynamic traffic routing systems and congestion pricing are getting importance in recent research. lane prediction and vehicle density estimation are important components of such systems. we introduce the novel problem of vehicle self-positioning, which involves predicting the number of lanes on the road and the vehicle's position in those lanes using videos captured by a dashboard camera. we propose an integrated closed-loop approach in which we use the presence of vehicles to aid the task of self-positioning and vice-versa. to incorporate multiple factors and high-level semantic knowledge into the solution, we formulate this problem in a bayesian framework. in the framework, the number of lanes, the vehicle's position in those lanes and the presence of other vehicles are considered as parameters.
we also propose a bounding box selection scheme to reduce the number of false detections and increase the computational efficiency. we show that the number of box proposals decreases by a factor of 6 using the selection approach. it also results in a large reduction in the number of false detections. the entire approach has been tested on real-world videos and is found to give acceptable results.",4 "relaxation in graph coloring and satisfiability problems. using t=0 monte carlo simulation, we study the relaxation of graph coloring (k-col) and satisfiability (k-sat), two hard problems that have recently been shown to possess a phase transition in solvability as a parameter is varied. a change from exponentially fast to power law relaxation, and a transition to freezing behavior, are found. these changes take place for smaller values of the parameter than the solvability transition. results for the coloring problem for colorable and clustered graphs and for the fraction of persistent spins for satisfiability are also presented.",3 "can the early human visual system compete with deep neural networks?. in this study we compare the human visual system and state-of-the-art deep neural networks on the classification of distorted images. different from previous works, we limit the display time to 100ms to test only the early mechanisms of the human visual system, without allowing time for eye movements or higher level processes. our findings show that the human visual system still outperforms modern deep neural networks on blurry and noisy images. these findings motivate future research in developing more robust deep networks.",4 "using fast weights to attend to the recent past. until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: neural activities that represent the current or recent input, and weights that learn to capture regularities among inputs, outputs and payoffs. there is no good reason for this restriction. synapses have dynamics at many different time-scales, and this suggests that artificial neural networks might benefit from variables that change slower than activities but much faster than standard weights. these ""fast weights"" can be used to store temporary memories of the recent past, and they provide a neurally plausible way of implementing the type of attention to the past that has recently proved helpful in sequence-to-sequence models.
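the fast-weight idea in the entry above is often described with a decaying hebbian outer-product memory: the fast-weight matrix is scaled down each step and a fraction of the current hidden state's outer product is added, and reading is a matrix-vector product. a minimal sketch assuming that standard formulation, with hypothetical names and constants:

```python
def fast_weight_update(A, h, decay=0.95, lr=0.5):
    """one hebbian fast-weight update A <- decay*A + lr*h h^T,
    so A holds a rapidly fading outer-product memory of the
    recently seen hidden states h (lists of floats)."""
    n = len(h)
    return [[decay * A[i][j] + lr * h[i] * h[j] for j in range(n)]
            for i in range(n)]

def fast_weight_read(A, h):
    """attend to the recent past: the matrix-vector product A h
    retrieves stored patterns in proportion to their overlap
    with the current state h."""
    return [sum(A[i][j] * h[j] for j in range(len(h)))
            for i in range(len(h))]
```

states stored many steps ago are multiplied by decay repeatedly, so the memory fades on a time-scale between activities and slow weights.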
by using fast weights we can avoid the need to store copies of neural activity patterns.",19 "super-resolution of wavelet-encoded images. multiview super-resolution image reconstruction (srir) is often cast as a resampling problem, merging non-redundant data from multiple low-resolution (lr) images on a finer high-resolution (hr) grid while inverting the effect of the camera point spread function (psf). one main problem with multiview methods is that resampling from nonuniform samples (provided by lr images) and the inversion of the psf are highly nonlinear and ill-posed problems. non-linearity and ill-posedness are typically overcome by linearization and regularization, often through an iterative optimization process, which essentially trades off the very information (i.e. high frequency) that we want to recover. we propose a novel point of view for multiview srir: unlike existing multiview methods that reconstruct the entire spectrum of the hr image from the multiple given lr images, we derive explicit expressions that show how the high-frequency spectra of the unknown hr image are related to the spectra of the lr images. therefore, taking any of the lr images as a reference to represent the low-frequency spectra of the hr image, one can reconstruct the super-resolution image by focusing on the reconstruction of the high-frequency spectra. this is much like single-image methods, which extrapolate the spectrum of one image, except that we rely on information provided by all other views, rather than on the prior constraints of single-image methods (which may not be an accurate source of information). this is made possible by deriving and applying explicit closed-form expressions that define how the local high frequency information that we aim to recover for the reference high resolution image is related to the local low frequency information in the sequence of views. our results and comparisons to recently published state-of-the-art methods show the superiority of the proposed solution.",4 "learning without concentration. we obtain sharp bounds on the performance of empirical risk minimization performed in a convex class and with respect to the squared loss, without assuming that class members and the target are bounded functions or have rapidly decaying tails. rather than resorting to a concentration-based argument, the method used here relies on a `small-ball' assumption and thus holds for classes consisting of heavy-tailed functions and for heavy-tailed targets.
the resulting estimates scale correctly with the `noise level' of the problem, and when applied to the classical, bounded scenario, they always improve the known bounds.",4 "a general framework for the recognition of online handwritten graphics. we propose a new framework for the recognition of online handwritten graphics. three main features of the framework are the ability to treat symbol and structural level information in an integrated way, flexibility with respect to different families of graphics, and a means to control the tradeoff between recognition effectiveness and computational cost. we model a graphic as a labeled graph generated from a graph grammar. non-terminal vertices represent subcomponents, terminal vertices represent symbols, and edges represent relations between subcomponents or symbols. we then model the recognition problem as a graph parsing problem: given an input stroke set, we search for a parse tree that represents the best interpretation of the input. our graph parsing algorithm generates multiple interpretations (consistent with the grammar), and we then extract an optimal interpretation according to a cost function that takes into consideration the likelihood scores of symbols and structures. the parsing algorithm consists of recursively partitioning the stroke set according to structures defined in the grammar, and it does not impose constraints present in some previous works (e.g. stroke ordering). by avoiding such constraints, and thanks to the powerful representativeness of graphs, our approach can be adapted to the recognition of different graphic notations. we show applications to the recognition of mathematical expressions and flowcharts. experimentation shows that our method obtains state-of-the-art accuracy in both applications.",4 rule-based query answering method for a knowledge base of economic crimes. we present a description of a phd thesis which aims to propose a rule-based query answering method for relational data. in this approach we use additional knowledge, represented as a set of rules, that describes the source data at a concept (ontological) level. queries are posed in terms of this abstract level. we present two methods: the first one uses hybrid reasoning, and the second one exploits only forward chaining. the two methods are demonstrated by prototypical implementations of a system coupled with the jess engine.
tests are performed on a knowledge base of selected economic crimes: fraudulent disbursement and money laundering.,4 "autonomous quantum perceptron neural network. recently, with the rapid development of technology, there are many applications that require low-cost learning. however, the computational power of classical artificial neural networks is not capable of providing such low-cost learning. in contrast, quantum neural networks may represent a good computational alternative to classical neural network approaches, based on the computational power of the quantum bit (qubit) over the classical bit. in this paper we present a new computational approach to the quantum perceptron neural network that can achieve learning with low-cost computation. in the proposed approach, one neuron can construct self-adaptive activation operators capable of accomplishing the learning process in a limited number of iterations and, thereby, reducing the overall computational cost. the proposed approach is capable of constructing its own set of activation operators that can be applied widely in both quantum and classical applications to overcome the linearity limitation of the classical perceptron. the computational power of the proposed approach is illustrated by solving a variety of problems, for which promising and comparable results are given.",4 "sketching for large-scale learning of mixture models. learning parameters from voluminous data can be prohibitive in terms of memory and computational requirements. we propose a ""compressive learning"" framework in which we estimate model parameters from a sketch of the training data. this sketch is a collection of generalized moments of the underlying probability distribution of the data. it can be computed in a single pass on the training set, and is easily computable on streams or distributed datasets. the proposed framework shares similarities with compressive sensing, which aims at drastically reducing the dimension of high-dimensional signals while preserving the ability to reconstruct them. to perform the estimation task, we derive an iterative algorithm analogous to sparse reconstruction algorithms in the context of linear inverse problems. we exemplify our framework with the compressive estimation of a gaussian mixture model (gmm), providing heuristics on the choice of the sketching procedure and theoretical guarantees of reconstruction.
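the sketch in the compressive-learning entry above is a vector of generalized moments; one common instance is a set of empirical characteristic-function samples at random frequencies, averaged over the data in a single pass. a minimal sketch of that instance (the frequency design and the reconstruction step from the entry are omitted):

```python
import cmath

def sketch(samples, freqs):
    """compressive-learning style sketch: for each frequency w,
    average the complex exponential exp(i w.x) over the data.
    computed in one pass, fixed size, and mergeable across
    chunks of a stream or a distributed dataset."""
    m = len(samples)
    return [sum(cmath.exp(1j * sum(wi * xi for wi, xi in zip(w, x)))
                for x in samples) / m
            for w in freqs]
```

because each entry is a plain average, sketches of two equal-sized data chunks can be merged by averaging, which is what makes the representation convenient for streams.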
we experimentally show on synthetic data that the proposed algorithm yields results comparable to the classical expectation-maximization (em) technique while requiring significantly less memory and fewer computations when the number of database elements is large. we further demonstrate the potential of the approach on real large-scale data (over 10^8 training samples) for the task of model-based speaker verification. finally, we draw some connections between the proposed framework and approximate hilbert space embedding of probability distributions using random features. we show that the proposed sketching operator can be seen as an innovative method to design translation-invariant kernels adapted to the analysis of gmms. we also use this theoretical framework to derive information preservation guarantees, in the spirit of infinite-dimensional compressive sensing.",4 "one-pass person re-identification by sketch online discriminant analysis. person re-identification (re-id) is the task of matching people across disjoint camera views in a multi-camera system, and re-id has been an important technology applied in smart cities in recent years. however, the majority of existing person re-id methods are not designed for processing sequential data in an online way. this ignores the real-world scenario that person images detected in a multi-camera system arrive sequentially. while there are a few works discussing online re-id, most of them require considerable storage of all passed data samples ever observed, which could be unrealistic for processing data from a large camera network. in this work, we present a one-pass person re-id model that adapts the re-id model based on each newly observed datum, and no passed data are directly used for each update. more specifically, we develop a sketch online discriminant analysis (soda) by embedding sketch processing into fisher discriminant analysis (fda). soda can efficiently keep the main data variations of all passed samples in a low rank matrix when processing sequential data samples, and estimate the approximate within-class variance (i.e. the within-class covariance matrix) from the sketch data information. we provide theoretical analysis on the effect of the estimated approximate within-class covariance matrix. in particular, we derive upper and lower bounds on the fisher discriminant score (i.e.
the quotient of the between-class variation and the within-class variation after feature transformation) in order to investigate how the optimal feature transformation learned by soda sequentially approximates the offline fda learned on all observed data. extensive experimental results have shown the effectiveness of our soda and empirically support our theoretical analysis.",4 "image enhancement by statistical estimation. contrast enhancement is an important area of research in image analysis. over the last decade, researchers have worked in this domain to develop efficient and adequate algorithms. the proposed method enhances the contrast of an image using a binarization method with the help of maximum likelihood estimation (mle). the paper aims to enhance the image contrast of bimodal and multi-modal images. the proposed methodology makes use of mathematical information retrieved from the image. in this paper, the binarization method generates the desired histogram by separating image nodes. it then generates the enhanced image using histogram specification with the binarization method. the proposed method shows an improvement in image contrast enhancement compared with the original image.",4 "the os* algorithm: a joint approach to exact optimization and sampling. most current sampling algorithms for high-dimensional distributions are based on mcmc techniques and are approximate in the sense that they are valid only asymptotically. rejection sampling, on the other hand, produces valid samples, but is unrealistically slow in high-dimension spaces. the os* algorithm that we propose is a unified approach to exact optimization and sampling, based on incremental refinements of a functional upper bound, which combines ideas of adaptive rejection sampling and of a* optimization search. we show that the choice of the refinement can be done in a way that ensures tractability in high-dimension spaces, and we present first experiments in two different settings: inference in high-order hmms and in large discrete graphical models.",4 "lego: learning edge with geometry all at once by watching videos. learning to estimate 3d geometry from a single image by watching unlabeled videos via a deep convolutional network is attracting significant attention. in this paper, we introduce a ""3d as-smooth-as-possible (3d-asap)"" priori inside the pipeline, which enables joint estimation of edges and 3d scene, yielding results with significant improvement in accuracy for fine detailed structures.
specifically, we define the 3d-asap priori by requiring that any two points recovered in 3d from an image should lie on an existing planar surface if no other cues are provided. we design an unsupervised framework that learns edges and geometry (depth, normal) all at once (lego). the predicted edges are embedded into the depth and surface normal smoothness terms, where pixels without edges in-between are constrained to satisfy the priori. in our framework, the predicted depths, normals and edges are forced to be consistent at all times. we conduct experiments on kitti to evaluate our estimated geometry and on cityscapes to perform edge evaluation. we show that in all of the tasks, i.e. depth, normal and edge, our algorithm vastly outperforms other state-of-the-art (sota) algorithms, demonstrating the benefits of our approach.",4 "zipf's law for word frequencies: word forms versus lemmas in long texts. zipf's law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. we raise the question of the elementary units for which zipf's law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. in order to have sources as homogeneous as possible, we analyze some of the longest literary texts ever written, comprising four different languages, with different levels of morphological complexity. in all cases zipf's law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. we investigate the extent to which the word-lemma transformation preserves two parameters of zipf's law: the exponent and the low-frequency cut-off. while we are not able to demonstrate a strict invariance of the tail, as for some texts the exponents deviate significantly, we conclude that the exponents are similar, despite the remarkable transformation that going from words to lemmas represents, which considerably affects all ranges of frequencies. in contrast, the low-frequency cut-offs are less stable.",15 "context-aware generative adversarial privacy. preserving the utility of published datasets while simultaneously providing provable privacy guarantees is a well-known challenge. on the one hand, context-free privacy solutions, such as differential privacy, provide strong privacy guarantees, but often lead to a significant reduction in utility.
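the zipf exponent discussed in the word-frequency entry above can be estimated from raw text by regressing log-frequency on log-rank. this is a minimal sketch with a hypothetical function name, using a plain least-squares slope rather than the more careful fitting an empirical study would use:

```python
import math
from collections import Counter

def zipf_exponent(tokens):
    """least-squares slope of log-frequency against log-rank;
    under zipf's law f(r) ~ r^(-alpha) the slope is -alpha,
    so the function returns alpha as a positive number."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

on counts that follow an exact 1/r profile (e.g. 6, 3, 2 occurrences for ranks 1, 2, 3) the estimate is exactly 1; comparing the estimate on word forms against lemmas is the kind of analysis the entry describes.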
on the other hand, context-aware privacy solutions, such as information theoretic privacy, achieve an improved privacy-utility tradeoff, but assume that the data holder has access to dataset statistics. we circumvent these limitations by introducing a novel context-aware privacy framework called generative adversarial privacy (gap). gap leverages recent advancements in generative adversarial networks (gans) to allow the data holder to learn privatization schemes from the dataset itself. under gap, learning the privacy mechanism is formulated as a constrained minimax game between two players: a privatizer that sanitizes the dataset in a way that limits the risk of inference attacks on the individuals' private variables, and an adversary that tries to infer the private variables from the sanitized dataset. to evaluate gap's performance, we investigate two simple (yet canonical) statistical dataset models: (a) the binary data model, and (b) the binary gaussian mixture model. for both models, we derive game-theoretically optimal minimax privacy mechanisms, and show that the privacy mechanisms learned from data (in a generative adversarial fashion) match the theoretically optimal ones. this demonstrates that our framework can be easily applied in practice, even in the absence of dataset statistics.",4 "crowdsourcing ground truth for medical relation extraction. cognitive computing systems require human labeled data for evaluation, and often for training. the standard practice used in gathering this data minimizes disagreement between annotators, but we have found that this results in data that fails to account for the ambiguity inherent in language. we have proposed the crowdtruth method for collecting ground truth through crowdsourcing, which reconsiders the role of people in machine learning, based on the observation that disagreement between annotators provides a useful signal for phenomena such as ambiguity in the text. we report on using this method to build an annotated data set for medical relation extraction for the $cause$ and $treat$ relations, and how this data performed in a supervised training experiment. we demonstrate that by modeling ambiguity, labeled data gathered from crowd workers can (1) reach the level of quality of domain experts for this task while reducing the cost, and (2) provide better training data at scale than distant supervision.
We further propose and validate new weighted measures for precision, recall, and F-measure, which account for ambiguity in both human and machine performance on this task.",4 "Churn prediction in mobile social games: towards a complete assessment using survival ensembles. Reducing user attrition, i.e. churn, is a broad challenge faced by several industries. In mobile social games, decreasing churn is decisive to increase player retention and raise revenues. Churn prediction models allow us to understand player loyalty and to anticipate when they will stop playing the game. Thanks to these predictions, several initiatives can be taken to retain those players who are likely to churn. Survival analysis focuses on predicting the time of occurrence of a certain event, churn in our case. Classical methods, like regressions, could be applied only to players who have already left the game. The challenge arises for datasets with incomplete churning information for all players, as most of them still connect to the game. This is called the censored data problem and is in the nature of churn. Censoring is commonly dealt with using survival analysis techniques, but due to the inflexibility of the survival statistical algorithms, the accuracy achieved is often poor. In contrast, novel ensemble learning techniques, increasingly popular in a variety of scientific fields, provide high-class prediction results. In this work, we develop, for the first time in the social games domain, a survival ensemble model which provides a comprehensive analysis together with an accurate prediction of churn. For each player, we predict the probability of churning as a function of time, which permits us to distinguish various levels of loyalty profiles. Additionally, we assess the risk factors that explain the predicted player survival times. Our results show that churn prediction by survival ensembles significantly improves the accuracy and robustness of traditional analyses, like Cox regression.",19 "Design of an intelligent agents based system for commodity market simulation with JADE. The market of the potato commodity at industry scale usage is engaging several types of actors: farmers, middlemen, and industries. A multi-agent system is built to simulate these actors as agent entities, based on manually given parameters within a simulation scenario file. Each type of agent has fuzzy logic representing the actual actors' knowledge, used for interpreting values and taking appropriate decisions in the simulation.
The system simulates market activities with programmed behaviors and produces the results as spreadsheet and chart graph files. The results consist of each agent's yearly finance and commodity data. The system can also predict the next values of its outputs.",4 "An efficient algorithm for learning with semi-bandit feedback. We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the follow-the-perturbed-leader (FPL) prediction method with a novel loss estimation procedure called geometric resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm.",4 "A hybrid model for solving multi-objective problems using an evolutionary algorithm and tabu search. This paper presents a new multi-objective hybrid model that establishes cooperation between the strength of neighborhood search methods presented in tabu search (TS) and the important exploration capacity of an evolutionary algorithm. The model was implemented and tested on benchmark functions (ZDT1, ZDT2, ZDT3), using a network of computers.",4 "Combining models of approximation with partial learning. In Gold's framework of inductive inference, the model of partial learning requires the learner to output exactly one correct index for the target object, and only for the target object, infinitely often. Since infinitely many of the learner's hypotheses may be incorrect, it is not obvious whether a partial learner can be modified to ""approximate"" the target object. Fulk and Jain (Approximate inference and scientific method. Information and Computation 114(2):179--191, 1994) introduced a model of approximate learning of recursive functions.
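The geometric resampling (GR) idea in the semi-bandit abstract above — estimating the inverse selection probability of the chosen action by counting redraws from the same randomized policy — can be sketched in a toy single-arm form. This is my own minimal illustration under stated assumptions (names and the truncation cap are hypothetical), not the paper's implementation:

```python
import random

def geometric_resampling(chosen, observed_loss, sample_action, cap=100):
    """One-round GR loss estimate: redraw actions from the same
    randomized policy and count the draws until the chosen arm
    reappears; that count approximates 1/p(chosen), yielding an
    importance-weighted loss estimate without knowing p explicitly."""
    k = 1
    while k < cap and sample_action() != chosen:
        k += 1  # truncated at `cap` to keep the estimate bounded
    return observed_loss * k
```

Averaged over many rounds, the estimate concentrates around loss / p(chosen), up to a small truncation bias from the cap.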
The present work extends their research and solves an open problem of Fulk and Jain by showing that there is a learner which approximates and partially identifies every recursive function by outputting a sequence of hypotheses which, in addition, are also almost all finite variants of the target function. The subsequent study is dedicated to the question of how these findings generalise to the learning of r.e. languages from positive data. Here three variants of approximate learning are introduced and investigated with respect to the question of whether they can be combined with partial learning. Following the line of Fulk and Jain's research, the investigations provide conditions under which partial language learners can eventually output only finite variants of the target language. The combinabilities of other partial learning criteria are also briefly studied.",4 "Corpus annotation for parser evaluation. We describe a recently developed corpus annotation scheme for evaluating parsers that avoids shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.",4 "Linearly parameterized bandits. We consider bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an $r$-dimensional random vector $\mathbf{z} \in \mathbb{R}^r$, where $r \geq 2$. The objective is to minimize the cumulative regret and Bayes risk. When the set of arms corresponds to the unit sphere, we prove that the regret and Bayes risk are of order $\Theta(r \sqrt{T})$, by establishing a lower bound for an arbitrary policy, and showing that a matching upper bound is obtained through a policy that alternates between exploration and exploitation phases. The phase-based policy is also shown to be effective if the set of arms satisfies a strong convexity condition. For the case of a general set of arms, we describe a near-optimal policy whose regret and Bayes risk admit upper bounds of the form $O(r \sqrt{T} \log^{3/2} T)$.",4 "An iterative closest point method for measuring the level of similarity of 3D log scans in the wood industry. In the Canadian lumber industry, simulators are used to predict the lumbers resulting from the sawing of a log at a given sawmill. Given a log or several logs' 3D scans as input, the simulators perform a real-time job to predict the lumbers.
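The ICP-based similarity idea introduced above — align one point cloud to another, then use the residual matching error as a distance — can be sketched in a stripped-down, translation-only 2D form. A minimal illustration of the principle (my own toy version; real ICP also estimates rotation, and the paper works on 3D log scans):

```python
def icp_distance(src, ref, iters=20):
    """Translation-only ICP sketch: align point set `src` to `ref` by
    alternating nearest-neighbour matching with a mean-shift update,
    then report the mean matched distance as a similarity score."""
    src = [list(p) for p in src]
    for _ in range(iters):
        # nearest neighbour in ref for every src point
        matches = [min(ref, key=lambda q, p=p: (p[0]-q[0])**2 + (p[1]-q[1])**2)
                   for p in src]
        # shift src by the average residual towards its matches
        dx = sum(q[0] - p[0] for p, q in zip(src, matches)) / len(src)
        dy = sum(q[1] - p[1] for p, q in zip(src, matches)) / len(src)
        src = [[p[0] + dx, p[1] + dy] for p in src]
    matches = [min(ref, key=lambda q, p=p: (p[0]-q[0])**2 + (p[1]-q[1])**2)
               for p in src]
    return sum(((p[0]-q[0])**2 + (p[1]-q[1])**2) ** 0.5
               for p, q in zip(src, matches)) / len(src)
```

For a scan that is simply a shifted copy of the reference, the score collapses to (numerically) zero, which is the behaviour a nearest-neighbour log predictor would exploit.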
These simulators, however, tend to be slow at processing large volumes of wood. We thus explore an alternative approximation technique based on the iterative closest point (ICP) algorithm to identify which already processed log an unseen log resembles the most. The main benefit of the ICP approach is that it can easily handle 3D scans with a variable number of points. We compare this ICP-based nearest neighbor predictor with predictors built using machine learning algorithms such as k-nearest-neighbor (kNN) and random forest (RF). The implemented ICP-based predictor enabled us to identify key points in using 3D scans directly for distance calculation. The long-term goal of this ongoing research is to integrate ICP distance calculations with machine learning.",4 "Deep motion features for visual tracking. Robust visual tracking is a challenging computer vision problem, with many real-world applications. Most existing approaches employ hand-crafted appearance features, such as HOG or color names. Recently, deep RGB features extracted from convolutional neural networks have been successfully applied for tracking. Despite their success, these features only capture appearance information. On the other hand, motion cues provide discriminative and complementary information that can improve tracking performance. Contrary to visual tracking, deep motion features have been successfully applied for action recognition and video classification tasks. Typically, the motion features are learned by training a CNN on optical flow images extracted from large amounts of labeled videos. This paper presents an investigation of the impact of deep motion features in a tracking-by-detection framework. We show that hand-crafted, deep RGB, and deep motion features contain complementary information. To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking. Comprehensive experiments clearly suggest that our fusion approach with deep motion features outperforms standard methods relying on appearance information alone.",4 "Causal inference on multivariate and mixed-type data. Given data over the joint distribution of two random variables $X$ and $Y$, we consider the problem of inferring the most likely causal direction between $X$ and $Y$. In particular, we consider the general case where both $X$ and $Y$ may be univariate or multivariate, and of any single or mixed data types.
We take an information theoretic approach, based on Kolmogorov complexity, from which it follows that first describing the data over the cause and then the effect given the cause is shorter than in the reverse direction. The ideal score is not computable, but can be approximated through the minimum description length (MDL) principle. Based on MDL, we propose two scores, one for when both $X$ and $Y$ are of the same single data type, and one for when they are of mixed type. We model dependencies between $X$ and $Y$ using classification and regression trees. As inferring the optimal model is NP-hard, we propose CRACK, a fast greedy algorithm to determine the most likely causal direction directly from the data. Empirical evaluation on a wide range of data shows that CRACK reliably, and with high accuracy, infers the correct causal direction on both univariate and multivariate cause-effect pairs, over both single and mixed-type data.",19 "SCOUT-IT: interior tomography using modified scout acquisition. Global scout views have previously been used to reduce interior reconstruction artifacts in high-resolution micro-CT and C-arm systems. However, these methods cannot be directly used in the all-important domain of clinical CT. When the CT scan is truncated, the scout views are also truncated. However, in many cases the truncation in clinical CT involves only partial truncation, where the anterio-posterior (AP) scout is truncated, but the medio-lateral (ML) scout is non-truncated. In this paper, we show that for such cases of partially truncated CT scans, a modified configuration may be used to acquire a non-truncated AP scout view, to ultimately allow highly accurate interior reconstruction.",16 "Collaborative receptive field learning. The challenge of object categorization in images is largely due to arbitrary translations and scales of the foreground objects. To attack this difficulty, we propose a new approach called collaborative receptive field learning to extract specific receptive fields (RF's) or regions from multiple images, where the selected RF's are supposed to focus on the foreground objects of a common category. To this end, we solve the problem by maximizing a submodular function over a similarity graph constructed from a pool of RF candidates. However, measuring the pairwise distance of RF's and building the similarity graph is a nontrivial problem.
Hence, we introduce a similarity metric called pyramid-error distance (PED) to measure the pairwise distances by summing up pyramid-like matching errors over a set of low-level features. Besides, in consistency with the proposed PED, we construct a simple nonparametric classifier for classification. Experimental results show that our method effectively discovers the foreground objects in images, and improves classification performance.",4 "Optimizing recurrent neural network architectures under time constraints. A recurrent neural network (RNN)'s architecture is a key factor influencing its performance. We propose algorithms to optimize hidden sizes under a running time constraint. We convert the discrete optimization into a subset selection problem. By novel transformations, the objective function becomes submodular and the constraint becomes supermodular. A greedy algorithm with bounds is suggested to solve the transformed problem, and we show how the transformations influence the bounds. To speed up optimization, surrogate functions are proposed which balance exploration and exploitation. Experiments show that our algorithms can find more accurate models or faster models than manually tuned state-of-the-art and random search. We also compare popular RNN architectures using our algorithms.",19 "The production of probabilistic entropy in structure/action contingency relations. Luhmann (1984) defined society as a communication system which is structurally coupled to, but not an aggregate of, human action systems. The communication system is then considered as self-organizing (""autopoietic""), as are human actors. Communication systems can be studied using Shannon's (1948) mathematical theory of communication. The update of a network by action at one of the local nodes is a well-known problem in artificial intelligence (Pearl 1988). By combining these various theories, a general algorithm for probabilistic structure/action contingency can be derived. The consequences of this contingency for the system, the consequences for its histories, and the stabilization on each side by counterbalancing mechanisms are discussed, in both mathematical and theoretical terms. An empirical example is elaborated.",4 "Multilabel classification with ranking and partial feedback. We present a novel multilabel/ranking algorithm working in partial information settings.
The algorithm is based on 2nd-order descent methods, and relies on upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partial adversarial setting, where covariates can be adversarial, but multilabel probabilities are ruled by (generalized) linear models. We show O(T^{1/2} log T) regret bounds, which improve in several ways on the existing results. We test the effectiveness of our upper-confidence scheme by contrasting against full-information baselines on real-world multilabel datasets, often obtaining comparable performance.",4 "Approximation algorithms for $\ell_0$-low rank approximation. We study the $\ell_0$-low rank approximation problem, where the goal is, given an $m \times n$ matrix $A$, to output a rank-$k$ matrix $A'$ for which $\|A'-A\|_0$ is minimized. Here, for a matrix $B$, $\|B\|_0$ denotes the number of its non-zero entries. This is an NP-hard variant of low rank approximation which is natural for problems with no underlying metric, where the goal is to minimize the number of disagreeing data positions. We provide approximation algorithms which significantly improve the running time and approximation factor of previous work. For $k > 1$, we show how to find, in poly$(mn)$ time for every $k$, a rank $O(k \log(n/k))$ matrix $A'$ for which $\|A'-A\|_0 \leq O(k^2 \log(n/k)) \mathrm{OPT}$. To the best of our knowledge, this is the first algorithm with provable guarantees for the $\ell_0$-low rank approximation problem for $k > 1$, even for bicriteria algorithms. For the well-studied case when $k = 1$, we give a $(2+\epsilon)$-approximation in {\it sublinear time}, which is impossible for other variants of low rank approximation such as for the Frobenius norm. We strengthen this for the well-studied case of binary matrices to obtain a $(1+O(\psi))$-approximation in sublinear time, where $\psi = \mathrm{OPT}/\lvert A\rvert_0$. For small $\psi$, our approximation factor is $1+o(1)$.",4 "Learning spatio-temporal representation with pseudo-3D residual networks. Convolutional neural networks (CNN) are regarded as a powerful class of models for image recognition problems. Nevertheless, it is not trivial to utilize a CNN for learning spatio-temporal video representation. A few studies have shown that performing 3D convolutions is a rewarding approach to capture both spatial and temporal dimensions in videos. However, the development of a very deep 3D CNN from scratch results in expensive computational cost and memory demand.
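The $\ell_0$ objective in the low-rank abstract above is concrete enough to illustrate for the binary rank-1 case: given a candidate column $u$, the optimal $v$ can be chosen column-by-column by counting disagreements. A toy sketch of evaluating that objective (my own illustration of the problem statement, not the paper's algorithm):

```python
def l0_cost_rank1_binary(A, u):
    """Given a binary matrix A and a candidate binary column vector u,
    choose each entry of v optimally and return ||A - u v^T||_0,
    i.e. the number of disagreeing positions."""
    m, n = len(A), len(A[0])
    total = 0
    for j in range(n):
        col = [A[i][j] for i in range(m)]
        cost_v0 = sum(col)  # v_j = 0: every 1 in the column disagrees
        cost_v1 = sum(1 for i in range(m) if col[i] != u[i])  # v_j = 1
        total += min(cost_v0, cost_v1)
    return total
```

A search over candidate vectors $u$ scored this way makes clear why the exact problem is combinatorial, and why the paper resorts to approximation.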
It is thus a valid question whether we can recycle off-the-shelf 2D networks for a 3D CNN. In this paper, we devise multiple variants of bottleneck building blocks in a residual learning framework by simulating $3\times3\times3$ convolutions with $1\times3\times3$ convolutional filters on the spatial domain (equivalent to a 2D CNN) plus $3\times1\times1$ convolutions to construct temporal connections on adjacent feature maps in time. Furthermore, we propose a new architecture, named pseudo-3D residual net (P3D ResNet), that exploits all the variants of blocks but composes each in a different placement of the ResNet, following the philosophy that enhancing structural diversity while going deep could improve the power of neural networks. Our P3D ResNet achieves clear improvements on the Sports-1M video classification dataset over 3D CNN and frame-based 2D CNN by 5.3% and 1.8%, respectively. We further examine the generalization performance of the video representation produced by our pre-trained P3D ResNet on five different benchmarks and three different tasks, demonstrating superior performances over several state-of-the-art techniques.",4 "KBLRN: end-to-end learning of knowledge base representations with latent, relational, and numerical features. We present KBLRN, a framework for end-to-end learning of knowledge base representations from latent, relational, and numerical features. KBLRN integrates the feature types with a novel combination of neural representation learning and probabilistic product of experts models. To the best of our knowledge, KBLRN is the first approach that learns representations of knowledge bases by integrating latent, relational, and numerical features. We show that instances of KBLRN outperform existing methods on a range of knowledge base completion tasks. We contribute novel data sets enriching commonly used knowledge base completion benchmarks with numerical features, and have made the data sets available for further research. We also investigate the impact numerical features have on the KB completion performance of KBLRN.",4 "Semi-supervised model-based clustering with controlled clusters leakage. In this paper, we focus on finding clusters in partially categorized data sets. We propose a semi-supervised version of the Gaussian mixture model, called C3L, which retrieves natural subgroups of given categories.
In contrast to other semi-supervised models, C3L is parametrized by a user-defined leakage level, which controls the maximal inconsistency between the initial categorization and the resulting clustering. Our method can be implemented as a module in practical expert systems to detect clusters, which combines expert knowledge with the true distribution of the data. Moreover, it can be used for improving the results of less flexible clustering techniques, such as projection pursuit clustering. The paper presents an extensive theoretical analysis of the model and a fast algorithm for its efficient optimization. Experimental results show that C3L finds high quality clustering models, which can be applied to discovering meaningful groups in partially classified data.",4 "Aspects of evolutionary design by computers. This paper examines the four main types of evolutionary design by computers: evolutionary design optimisation, evolutionary art, evolutionary artificial life forms and creative evolutionary design. Definitions of all four areas are provided. A review of current work in each of these areas is given, with examples of the types of applications that have been tackled. The different properties and requirements are examined. Descriptions of typical representations and evolutionary algorithms are provided, and examples of designs evolved using these techniques are shown. The paper then discusses how the boundaries of these areas are beginning to merge, resulting in four new 'overlapping' types of evolutionary design: integral evolutionary design, artificial life based evolutionary design, aesthetic evolutionary AL and aesthetic evolutionary design. Finally, the last part of the paper discusses common problems faced by creators of evolutionary design systems, including: interdependent elements of designs, epistasis, and constraint handling.",4 "Flow-guided feature aggregation for video object detection. Extending state-of-the-art object detectors from image to video is challenging. The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc. Existing work attempts to exploit temporal information on the box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning framework for video object detection. It leverages temporal coherence on the feature level instead.
It improves the per-frame features by aggregating nearby features along the motion paths, and thus improves the video recognition accuracy. Our method significantly improves upon strong single-frame baselines on ImageNet VID, especially for more challenging fast moving objects. Our framework is principled, and on par with the best engineered systems winning the ImageNet VID challenges 2016, without additional bells-and-whistles. The proposed method, together with Deep Feature Flow, powered the winning entry of the ImageNet VID challenges 2017. The code is available at https://github.com/msracver/flow-guided-feature-aggregation.",4 "Information directed sampling for stochastic bandits with graph feedback. We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action. We allow the graph structure to vary with time and consider both deterministic and Erd\H{o}s-R\'enyi random graph models. For such a graph feedback model, we first present a novel analysis of Thompson sampling that leads to a tighter performance bound than existing work. Next, we propose new information directed sampling based policies that are graph-aware in their decision making. Under the deterministic graph case, we establish a Bayesian regret bound for the proposed policies that scales with the clique cover number of the graph instead of the number of actions. Under the random graph case, we provide a Bayesian regret bound for the proposed policies that scales with the ratio of the number of actions over the expected number of observations per iteration. To the best of our knowledge, this is the first analytical result for stochastic bandits with random graph feedback. Finally, using numerical evaluations, we demonstrate that our proposed IDS policies outperform existing approaches, including adaptions of the upper confidence bound, $\epsilon$-greedy and Exp3 algorithms.",4 "Optical flow-based 3D human motion estimation from monocular video. We present a generative method to estimate 3D human motion and body shape from monocular video. Under the assumption that, starting from an initial pose, optical flow constrains subsequent human motion, we exploit flow to find temporally coherent human poses of a motion sequence. We estimate human motion by minimizing the difference between computed flow fields and the output of an artificial flow renderer.
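The graph-feedback setting in the bandit abstract above — pulling one arm but also observing its neighbours — composes naturally with Thompson sampling, which the paper analyzes. A minimal Bernoulli sketch under my own assumptions (toy means, a fixed graph, Beta(1,1) priors; this is an illustration of the setting, not the paper's IDS policies):

```python
import random

def thompson_graph(arm_means, neighbors, rounds=2000, seed=1):
    """Thompson sampling with graph feedback: after pulling an arm we
    also observe rewards of its neighbours, and update the Beta
    posterior of every observed arm, not just the pulled one."""
    rng = random.Random(seed)
    n = len(arm_means)
    a, b = [1] * n, [1] * n          # Beta(1,1) priors per arm
    pulls = [0] * n
    for _ in range(rounds):
        samples = [rng.betavariate(a[i], b[i]) for i in range(n)]
        k = max(range(n), key=samples.__getitem__)
        pulls[k] += 1
        for i in {k} | set(neighbors[k]):   # graph side observations
            r = 1 if rng.random() < arm_means[i] else 0
            a[i] += r
            b[i] += 1 - r
    return pulls
```

With side observations, suboptimal arms get posterior updates "for free", so the policy concentrates on the best arm faster than in the bandit-feedback case.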
Only a single initialization step is required to estimate motion over multiple frames. Several regularization functions enhance robustness over time. Our test scenarios demonstrate that optical flow effectively regularizes the under-constrained problem of human shape and motion estimation from monocular video.",4 "Learning belief networks in domains with recursively embedded pseudo independent submodels. A pseudo independent (PI) model is a probabilistic domain model (PDM) where proper subsets of a set of collectively dependent variables display marginal independence. PI models cannot be learned correctly by many algorithms that rely on a single link search. Earlier work on learning PI models suggested a straightforward multi-link search algorithm. However, when a domain contains recursively embedded PI submodels, they may escape the detection of such an algorithm. In this paper, we propose an improved algorithm that ensures the learning of all embedded PI submodels whose sizes are upper bounded by a predetermined parameter. We show that this improved learning capability only increases the complexity slightly beyond that of the previous algorithm. The performance of the new algorithm is demonstrated in an experiment.",4 "Delay-optimal power and subcarrier allocation for OFDMA systems via stochastic approximation. In this paper, we consider delay-optimal power and subcarrier allocation design for OFDMA systems with $N_F$ subcarriers, $K$ mobiles and one base station. There are $K$ queues at the base station for the downlink traffic to the $K$ mobiles, with heterogeneous packet arrivals and delay requirements. We shall model the problem as a $K$-dimensional infinite horizon average reward Markov decision problem (MDP), where the control actions are assumed to be a function of the instantaneous channel state information (CSI) as well as the joint queue state information (QSI). This problem is challenging as it corresponds to a stochastic network utility maximization (NUM) problem whose general solution is still unknown. We propose an {\em online stochastic value iteration} solution using {\em stochastic approximation}. The proposed power control algorithm, which is a function of both the CSI and the QSI, takes the form of multi-level water-filling. We prove that under two mild conditions in Theorem 1 (one is the stepsize condition.
The other is the condition on the accessibility of the Markov chain, which can be easily satisfied in the cases of interest.), the proposed solution converges to the optimal solution almost surely (with probability 1), and the proposed framework offers a possible solution to the general stochastic NUM problem. By exploiting the birth-death structure of the queue dynamics, we obtain a reduced complexity decomposed solution with linear $\mathcal{O}(K N_F)$ complexity and $\mathcal{O}(K)$ memory requirement.",4 "Sparse overcomplete word vector representations. Current distributed representations of words show little resemblance to theories of lexical semantics. The former are dense and uninterpretable, the latter largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g., synonymy and hypernymy). We propose methods that transform word vectors into sparse (and optionally binary) vectors. The resulting representations are similar to the interpretable features typically used in NLP, though they are discovered automatically from raw corpora. Because the vectors are highly sparse, they are computationally easy to work with. Most importantly, we find that they outperform the original vectors on benchmark tasks.",4 "The steerable graph Laplacian and its application to filtering image data-sets. In recent years, improvements in various scientific image acquisition techniques gave rise to the need for adaptive processing methods aimed at large data-sets corrupted by noise and deformations. In this work, we consider data-sets of images sampled from an underlying low-dimensional manifold (i.e. an image-valued manifold), where the images can be obtained through arbitrary planar rotations. We derive an adapted mathematical framework for processing such data-sets, and introduce a graph Laplacian-like operator, termed the steerable graph Laplacian (sGL), which extends the standard graph Laplacian (GL) by accounting for all (infinitely-many) planar rotations of the images. As it turns out, a properly normalized sGL converges to the Laplace-Beltrami operator of the low-dimensional manifold, with an improved convergence rate compared to the GL. Moreover, the sGL admits eigenfunctions of the form of Fourier modes multiplied by eigenvectors of certain matrices. For image data-sets corrupted by noise, we employ a subset of these eigenfunctions to ""filter"" the data-set, essentially using all images and their rotations simultaneously.
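The multi-level water-filling structure mentioned in the OFDMA abstract above builds on classic water-filling, which is easy to sketch: allocate power p_i = max(0, mu - n_i) over channels with noise levels n_i, choosing the water level mu so the allocations meet the budget. A minimal single-level illustration (my own, via bisection; the paper's scheme is queue-aware and multi-level):

```python
def water_filling(noise, budget, tol=1e-9):
    """Classic water-filling sketch: find the water level mu by
    bisection so that sum_i max(0, mu - noise_i) equals the power
    budget, and return the per-channel allocations."""
    lo, hi = min(noise), max(noise) + budget
    while hi - lo > tol:
        mu = (lo + hi) / 2
        if sum(max(0.0, mu - n) for n in noise) > budget:
            hi = mu   # water level too high: spending more than budget
        else:
            lo = mu
    mu = (lo + hi) / 2
    return [max(0.0, mu - n) for n in noise]
```

Cleaner channels (lower noise) sit deeper under the water level and therefore receive more power.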
We demonstrate our filtering framework by de-noising simulated single-particle cryo-EM image data-sets.",4 "Deep learning for identifying radiogenomic associations in breast cancer. Purpose: to determine whether deep learning models can distinguish between breast cancer molecular subtypes based on dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Materials and methods: in this institutional review board-approved single-center study, we analyzed DCE-MR images of 270 patients at our institution. Lesions of interest were identified by radiologists. The task was to automatically determine whether the tumor is of luminal subtype or of another subtype based on the MR image patches representing the tumor. Three different deep learning approaches were used to classify the tumor according to its molecular subtype: learning from scratch, where only tumor patches were used for training; transfer learning, where networks pre-trained on natural images were fine-tuned using tumor patches; and off-the-shelf deep features, where the features extracted by neural networks trained on natural images were used for classification with a support vector machine. The network architectures utilized in our experiments were GoogleNet, VGG, and CIFAR. We used the 10-fold cross-validation method for validation and the area under the receiver operating characteristic curve (AUC) as the measure of performance. Results: the best AUC performance for distinguishing molecular subtypes, 0.65 (95% CI: [0.57, 0.71]), was achieved by the off-the-shelf deep features approach. The highest AUC performance for training from scratch was 0.58 (95% CI: [0.51, 0.64]) and the best AUC performance for transfer learning was 0.60 (95% CI: [0.52, 0.65]), respectively. For the off-the-shelf approach, features extracted from the fully connected layer performed best. Conclusion: deep learning may play a role in discovering radiogenomic associations in breast cancer.",4 "Traversing knowledge graphs in vector space. Path queries on a knowledge graph can be used to answer compositional questions such as ""what languages are spoken by people living in Lisbon?"". However, knowledge graphs often have missing facts (edges), which disrupts path queries. Recent models for knowledge base completion impute missing facts by embedding knowledge graphs in vector spaces.
We show that these models can be recursively applied to answer path queries, but that they suffer from cascading errors. This motivates a new ""compositional"" training objective, which dramatically improves the models' ability to answer path queries, in some cases doubling accuracy. On a standard knowledge base completion task, we also demonstrate that compositional training acts as a novel form of structural regularization, reliably improving performance across all base models (reducing errors by up to 43%) and achieving new state-of-the-art results.",4 "Rapid learning with stochastic focus of attention. We present a method to stop the evaluation of a decision making process when the result of the full evaluation is obvious. This trait is highly desirable for online margin-based machine learning algorithms, where a classifier traditionally evaluates all the features of every example. We observe that some examples are easier to classify than others, a phenomenon which is characterized by the event when most of the features agree on the class of an example. By stopping the feature evaluation when encountering an easy to classify example, the learning algorithm can achieve substantial gains in computation. Our method provides a natural attention mechanism for learning algorithms. By modifying Pegasos, a margin-based online learning algorithm, to include our attentive method, we lower the number of attributes computed from $n$ to an average of $O(\sqrt{n})$ features without loss in prediction accuracy. We demonstrate the effectiveness of attentive Pegasos on MNIST data.",4 "A novel parser design algorithm based on artificial ants. This article presents a unique design for a parser using the ant colony optimization algorithm. The paper implements the intuitive thought process of the human mind through the activities of artificial ants. The scheme presented here uses a bottom-up approach and the parsing program can directly use ambiguous or redundant grammars. We allocate a node corresponding to each production rule present in the given grammar. Each node is connected to all other nodes (representing the other production rules), thereby establishing a completely connected graph susceptible to the movement of artificial ants. Each ant tries to modify the sentential form by the production rule present in its node and upgrades its position until the sentential form reduces to the start symbol S. Successful ants deposit pheromone on the links traversed through.
Eventually, the optimum path is discovered by the links carrying the maximum amount of pheromone concentration. The design is simple, versatile, robust and effective, and obviates the calculation of the above mentioned sets and precedence relation tables. The advantages of the scheme lie in i) ascertaining whether a given string belongs to the language represented by the grammar, and ii) finding the shortest possible path from the given string to the start symbol in case multiple routes exist.",4 "Towards stability and optimality in stochastic gradient descent. Iterative procedures for parameter estimation based on stochastic gradient descent allow the estimation to scale to massive data sets. However, in both theory and practice, they suffer from numerical instability. Moreover, they are statistically inefficient estimators of the true parameter value. To address these two issues, we propose a new iterative procedure termed averaged implicit SGD (AI-SGD). For statistical efficiency, AI-SGD employs averaging of the iterates, which achieves the optimal Cram\'{e}r-Rao bound under strong convexity, i.e., it is an optimal unbiased estimator of the true parameter value. For numerical stability, AI-SGD employs an implicit update at each iteration, which is related to proximal operators in optimization. In practice, AI-SGD achieves competitive performance with state-of-the-art procedures. Furthermore, it is more stable than averaging procedures that do not employ proximal updates, and is simple to implement as it requires fewer tunable hyperparameters than procedures that do employ proximal updates.",19 "Collaborating robotics using nature-inspired meta-heuristics. This paper introduces collaborating robots that provide the possibility of enhanced task performance, high reliability and decreased cost. Collaborating-bots are a collection of mobile robots able to self-assemble and to self-organize in order to solve problems that cannot be solved by a single robot. These robots combine the power of swarm intelligence with the flexibility of self-reconfiguration, as aggregate collaborating-bots can dynamically change their structure to match environmental variations. Collaborating robots are networks of independent agents, and are potentially reconfigurable networks of communicating agents capable of coordinated sensing and interaction with the environment. These robots are going to be an important part of the future.
Though collaborating robots are limited in their individual capability, such robots deployed in large numbers can represent a strong force, similar to a colony of ants or a swarm of bees. We present a mechanism for collaborating robots based on swarm intelligence methods such as ant colony optimization and particle swarm optimization.",4 "People on media: jointly identifying credible news and trustworthy citizen journalists in online communities. Media seems to have become more partisan, often providing a biased coverage of news catering to the interest of specific groups. It is therefore essential to identify credible information content that provides an objective narrative of an event. News communities such as digg, reddit, or newstrust offer recommendations, reviews, quality ratings, and further insights on journalistic works. However, there is a complex interaction between different factors in such online communities: fairness and style of reporting, language clarity and objectivity, topical perspectives (like political viewpoint), expertise and bias of community members, and more. This paper presents a model to systematically analyze the different interactions in a news community between users, news, and sources. We develop a probabilistic graphical model that leverages this joint interaction to identify 1) highly credible news articles, 2) trustworthy news sources, and 3) expert users who perform the role of ""citizen journalists"" in the community. Our method extends CRF models to incorporate real-valued ratings, as some communities have fine-grained scales that cannot be easily discretized without losing information. To the best of our knowledge, this paper presents the first full-fledged analysis of credibility, trust, and expertise in news communities.",4 "A maximal large deviation inequality for sub-Gaussian variables. In this short note we prove a maximal concentration lemma for sub-Gaussian random variables, stating that for independent sub-Gaussian random variables we have \[P\left(\max_{1\le i\le n}S_{i}>\epsilon\right) \le\exp\left(-\frac{1}{n^2}\sum_{i=1}^{n}\frac{\epsilon^{2}}{2\sigma_{i}^{2}}\right), \] where $S_i$ is the sum of the first $i$ zero mean independent sub-Gaussian random variables and $\sigma_i$ is the variance of the $i$th random variable.",4 "Local expectation gradients for doubly stochastic variational inference.
We introduce local expectation gradients, a general purpose stochastic variational inference algorithm for constructing stochastic gradients by sampling from the variational distribution. This algorithm divides the problem of estimating the stochastic gradients over multiple variational parameters into smaller sub-tasks, so that each sub-task intelligently exploits the information coming from the most relevant part of the variational distribution. This is achieved by performing an exact expectation over the single random variable that most correlates with the variational parameter of interest, resulting in a Rao-Blackwellized estimate that has low variance and can work efficiently for both continuous and discrete random variables. Furthermore, the proposed algorithm has interesting similarities with Gibbs sampling, but at the same time, unlike Gibbs sampling, it can be trivially parallelized.",19 "Cognitive principles in robust multimodal interpretation. Multimodal conversational interfaces provide a natural means for users to communicate with computer systems through multiple modalities such as speech and gesture. To build effective multimodal interfaces, automated interpretation of user multimodal inputs is important. Inspired by the previous investigation on cognitive status in multimodal human machine interaction, we have developed a greedy algorithm for interpreting user referring expressions (i.e., multimodal reference resolution). This algorithm incorporates the cognitive principles of conversational implicature and the givenness hierarchy, and applies constraints from various sources (e.g., temporal, semantic, and contextual) to resolve references. Our empirical results have shown the advantage of this algorithm in efficiently resolving a variety of user references. Because of its simplicity and generality, this approach has the potential to improve the robustness of multimodal input interpretation.",4 "Conditional random fields and support vector machines: a hybrid approach. We propose a novel hybrid loss for multiclass and structured prediction problems that is a convex combination of a log loss for conditional random fields (CRFs) and a multiclass hinge loss for support vector machines (SVMs). We provide a sufficient condition for when the hybrid loss is Fisher consistent for classification. This condition depends on a measure of dominance between labels - specifically, the gap between the per observation probabilities of the most likely labels.
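The hybrid loss described in the CRF/SVM abstract above — a convex combination of the CRF log loss and the multiclass hinge loss — is simple enough to write down directly for the flat multiclass case. A minimal sketch (my own illustration of the combination, using the Crammer-Singer hinge; the paper also covers structured prediction):

```python
from math import exp, log

def hybrid_loss(scores, y, alpha=0.5):
    """Convex combination alpha * log-loss (CRF) + (1-alpha) * hinge
    loss (SVM), computed from per-class scores and true class index y."""
    # CRF-style log loss: log-sum-exp of scores minus the true score
    m = max(scores)
    log_z = m + log(sum(exp(s - m) for s in scores))
    log_loss = log_z - scores[y]
    # multiclass hinge loss (Crammer-Singer form)
    hinge = max(0.0, 1.0 + max(s for i, s in enumerate(scores) if i != y)
                - scores[y])
    return alpha * log_loss + (1 - alpha) * hinge
```

At alpha = 0 this reduces to the SVM hinge (zero once the margin exceeds 1), and at alpha = 1 to the CRF log loss (strictly positive but vanishing as the margin grows), which is exactly the trade-off the hybrid interpolates.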
also prove fisher consistency necessary parametric consistency learning models crfs. demonstrate empirically hybrid loss typically performs least well - often better - constituent losses variety tasks. also provide empirical comparison efficacy probabilistic margin based approaches multiclass structured prediction effects label dominance results.",4 "greedy active learning algorithm logistic regression models. study logistic model-based active learning procedure binary classification problems, adopt batch subject selection strategy modified sequential experimental design method. moreover, accompanying proposed subject selection scheme, simultaneously conduct greedy variable selection procedure update classification model labeled training subjects. proposed algorithm repeatedly performs subject variable selection steps prefixed stopping criterion reached. numerical results show proposed procedure competitive performance, smaller training size compact model, comparing classifier trained variables full data set. also apply proposed procedure well-known wave data set (breiman et al., 1984) confirm performance method.",19 "interactively transferring cnn patterns part localization. scenario one/multi-shot learning, conventional end-to-end learning strategies without sufficient supervision usually powerful enough learn correct patterns noisy signals. thus, given cnn pre-trained object classification, paper proposes method first summarizes knowledge hidden inside cnn dictionary latent activation patterns, builds new model part localization manually assembling latent patterns related target part via human interactions. use (e.g., three) annotations semantic object part retrieve certain latent patterns conv-layers represent target part. visualize latent patterns ask users remove incorrect patterns, order refine part representation. 
guidance human interactions, method exhibited superior performance part localization experiments.",4 "comparative study cnn, bovw lbp classification histopathological images. despite progress made field medical imaging, remains large area open research, especially due variety imaging modalities disease-specific characteristics. paper comparative study describing potential using local binary patterns (lbp), deep features bag-of-visual words (bovw) scheme classification histopathological images. introduce new dataset, \emph{kimia path960}, contains 960 histopathology images belonging 20 different classes (different tissue types). make dataset publicly available. small size dataset inter- intra-class variability makes ideal initial investigations comparing image descriptors search classification complex medical imaging cases like histopathology. investigate deep features, lbp histograms bovw classify images via leave-one-out validation. accuracy image classification obtained using lbp 90.62\% highest accuracy using deep features reached 94.72\%. dictionary approach (bovw) achieved 96.50\%. deep solutions may able deliver higher accuracies need extensive training large number (balanced) image datasets.",4 "algorithms irrelevance-based partial maps. irrelevance-based partial maps useful constructs domain-independent explanation using belief networks. look two definitions partial maps, prove important properties useful designing algorithms computing effectively. make use properties modifying standard map best-first algorithm, handle irrelevance-based partial maps.",4 "principal manifolds nonlinear dimension reduction via local tangent space alignment. nonlinear manifold learning unorganized data points challenging unsupervised learning data visualization problem great variety applications. paper present new algorithm manifold learning nonlinear dimension reduction. 
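The LBP descriptor compared in the histopathology study above reduces each pixel to an 8-bit code by thresholding its 3x3 neighbourhood against the centre, then histograms the codes. A minimal sketch of the basic (non-rotation-invariant) variant on a toy image:

```python
def lbp_code(img, r, c):
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel."""
    centre = img[r][c]
    # clockwise offsets starting at the top-left neighbour
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

img = [[10, 10, 10, 10],
       [10, 50, 50, 10],
       [10, 10, 10, 10]]
hist = lbp_histogram(img)
```

The histogram (often computed per cell and concatenated) is what a classifier consumes, which is why LBP needs no training stage, unlike the deep features in the same comparison.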
based set unorganized data points sampled noise manifold, represent local geometry manifold using tangent spaces learned fitting affine subspace neighborhood data point. tangent spaces aligned give internal global coordinates data points respect underlying manifold way partial eigendecomposition neighborhood connection matrix. present careful error analysis algorithm show reconstruction errors second-order accuracy. illustrate algorithm using curves surfaces 2d/3d higher dimensional euclidean spaces, 64-by-64 pixel face images various pose lighting conditions. also address several theoretical algorithmic issues research improvements.",4 "detection algorithms communication systems using deep learning. design analysis communication systems typically rely development mathematical models describe underlying communication channel, dictates relationship transmitted received signals. however, systems, molecular communication systems chemical signals used transfer information, possible accurately model relationship. scenarios, lack mathematical channel models, completely new approach design analysis required. work, focus one important aspect communication systems, detection algorithms, demonstrate borrowing tools deep learning, possible train detectors perform well, without knowledge underlying channel models. evaluate algorithms using experimental data collected chemical communication platform, channel model unknown difficult model analytically. show deep learning algorithms perform significantly better simple detector used previous works, also assume knowledge channel.",4 low-rank representation manifold curves. machine learning common interpret data point vector euclidean space. however data may actually functional i.e.\ data point function variable time function discretely sampled. naive treatment functional data traditional multivariate data lead poor performance since algorithms ignoring correlation curvature function. 
paper propose method analyse subspace structure functional data using state art low-rank representation (lrr). experimental evaluation synthetic real data reveals method massively outperforms conventional lrr tasks concerning functional data.,4 "kernel sparse models automated tumor segmentation. paper, propose sparse coding-based approaches segmentation tumor regions mr images. sparse coding data-adapted dictionaries successfully employed several image recovery vision problems. proposed approaches obtain sparse codes pixel brain magnetic resonance images considering intensity values location information. since trivial obtain pixel-wise sparse codes, combining multiple features sparse coding setup straightforward, propose perform sparse coding high-dimensional feature space non-linear similarities effectively modeled. use training data expert-segmented images obtain kernel dictionaries kernel k-lines clustering procedure. test image, sparse codes computed kernel dictionaries, used identify tumor regions. approach completely automated, require user intervention initialize tumor regions test image. furthermore, low complexity segmentation approach based kernel sparse codes, allows user initialize tumor region, also presented. results obtained proposed approaches validated manual segmentation expert radiologist, proposed methods lead accurate tumor identification.",4 "linking image text 2-way nets. linking two data sources basic building block numerous computer vision problems. canonical correlation analysis (cca) achieves utilizing linear optimizer order maximize correlation two views. recent work makes use non-linear models, including deep learning techniques, optimize cca loss feature space. paper, introduce novel, bi-directional neural network architecture task matching vectors two data sources. approach employs two tied neural network channels project two views common, maximally correlated space using euclidean loss. 
show direct link correlation-based loss euclidean loss, enabling use euclidean loss correlation maximization. overcome common euclidean regression optimization problems, modify well-known techniques problem, including batch normalization dropout. show state art results number computer vision matching tasks including mnist image matching sentence-image matching flickr8k, flickr30k coco datasets.",4 "transforming wikipedia ontology-based information retrieval search engine local experts using third-party taxonomy. wikipedia widely used finding general information wide variety topics. vocation provide local information. example, provides plot, cast, production information given movie, showing times local movie theatre. describe connect local information wikipedia, without altering content. case study present involves finding local scientific experts. using third-party taxonomy, independent wikipedia's category hierarchy, index information connected local experts, present activity reports, re-index wikipedia content using taxonomy. connections wikipedia pages local expert reports stored relational database, accessible public sparql endpoint. wikipedia gadget (or plugin) activated interested user, accesses endpoint wikipedia page accessed. additional tab wikipedia page allows user open list teams local experts associated subject matter wikipedia page. technique, though presented way identify local experts, generic, third party taxonomy, used connect wikipedia non-wikipedia data source.",4 "reinforcement learning based active learning method. paper, new reinforcement learning approach proposed based powerful concept named active learning method (alm) modeling. alm expresses multi-input-single-output system fuzzy combination single-input-singleoutput systems. proposed method actor-critic system similar generalized approximate reasoning based intelligent control (garic) structure adapt alm delayed reinforcement signals. 
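The direct link between a correlation-based loss and a Euclidean loss claimed in the 2-way-nets abstract has a simple special case worth checking numerically: for centred, unit-variance vectors, the mean squared distance equals 2(1 - correlation), so minimising one maximises the other. A quick sketch:

```python
import math

def standardise(v):
    """Centre to zero mean and scale to unit variance."""
    n = len(v)
    mu = sum(v) / n
    sd = math.sqrt(sum((x - mu) ** 2 for x in v) / n)
    return [(x - mu) / sd for x in v]

def mean_sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)

def corr(u, v):
    # valid because both inputs are already standardised
    return sum(a * b for a, b in zip(u, v)) / len(u)

x = standardise([1.0, 2.0, 3.0, 4.0])
y = standardise([1.5, 1.9, 3.2, 4.4])
lhs = mean_sq_dist(x, y)
rhs = 2.0 * (1.0 - corr(x, y))
```

The identity follows by expanding the square: each standardised vector contributes 1 to the mean of squares, leaving only the cross term.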
system uses temporal difference (td) learning model behavior useful actions control system. goodness action modeled reward- penalty-plane. ids planes updated according plane. shown system learn predefined fuzzy system without (through random actions).",4 "shape-based defect classification non destructive testing. aim work classify aerospace structure defects detected eddy current non-destructive testing. proposed method based assumption defect bound reaction probe coil impedance test. impedance plane analysis used extract feature vector shape coil impedance complex plane, use geometric parameters. shape recognition tested three different machine-learning based classifiers: decision trees, neural networks naive bayes. performance proposed detection system measured terms accuracy, sensitivity, specificity, precision matthews correlation coefficient. several experiments performed dataset eddy current signal samples aircraft structures. obtained results demonstrate usefulness approach competiveness existing descriptors.",4 "view-invariant template matching using homography constraints. change viewpoint one major factors variation object appearance across different images. thus, view-invariant object recognition challenging important image understanding task. paper, propose method match objects images taken different viewpoints. unlike methods literature, restriction camera orientations internal camera parameters imposed prior knowledge 3d structure object required. prove two cameras take pictures object two different viewing angels, relationship every quadruple points reduces special case homography two equal eigenvalues. based property, formulate problem error function indicates likely two sets 2d points projections set 3d points two different cameras. comprehensive set experiments conducted prove robustness method noise, evaluate performance real-world applications, face object recognition.",4 "harnessing cognitive features sarcasm detection. 
paper, propose novel mechanism enriching feature vector, task sarcasm detection, cognitive features extracted eye-movement patterns human readers. sarcasm detection challenging research problem, importance nlp applications review summarization, dialog systems sentiment analysis well recognized. sarcasm often traced incongruity becomes apparent full sentence unfolds. presence incongruity- implicit explicit- affects way readers eyes move text. observe difference behaviour eye, reading sarcastic non sarcastic sentences. motivated observation, augment traditional linguistic stylistic features sarcasm detection cognitive features obtained readers eye movement data. perform statistical classification using enhanced feature set obtained. augmented cognitive features improve sarcasm detection 3.7% (in terms f-score), performance best reported system.",4 "automatic quality assessment speech translation using joint asr mt features. paper addresses automatic quality assessment spoken language translation (slt). relatively new task defined formalized sequence labeling problem word slt hypothesis tagged good bad according large feature set. propose several word confidence estimators (wce) based automatic evaluation transcription (asr) quality, translation (mt) quality, (combined asr+mt). research work possible built specific corpus contains 6.7k utterances quintuplet containing: asr output, verbatim transcript, text translation, speech translation post-edition translation built. conclusion multiple experiments using joint asr mt features wce mt features remain influent asr feature bring interesting complementary information. robust quality estimators slt used re-scoring speech translation graphs providing feedback user interactive speech translation computer-assisted speech-to-text scenarios.",4 "object categorization finer levels requires higher spatial frequencies, therefore takes longer. 
human visual system contains hierarchical sequence modules take part visual perception different levels abstraction, i.e., superordinate, basic, subordinate levels. one important question identify ""entry"" level visual representation commenced process object recognition. long time, believed basic level advantage two others; claim challenged recently. used series psychophysics experiments, based rapid presentation paradigm, well two computational models, bandpass filtered images study processing order categorization levels. experiments, investigated type visual information required categorizing objects level varying spatial frequency bands input image. results psychophysics experiments computational models consistent. indicate different spatial frequency information different effects object categorization level. absence high frequency information, subordinate basic level categorization performed inaccurately, superordinate level performed well. means that, low frequency information sufficient superordinate level, basic subordinate levels. finer levels require high frequency information, appears take longer processed, leading longer reaction times. finally, avoid ceiling effect, evaluated robustness results adding different amounts noise input images repeating experiments. expected, categorization accuracy decreased reaction time increased significantly, trends same. this shows results due ceiling effect.",16 "denoising arterial spin labeling cerebral blood flow images using deep learning. arterial spin labeling perfusion mri noninvasive technique measuring quantitative cerebral blood flow (cbf), measurement subject low signal-to-noise-ratio (snr). various post-processing methods proposed denoise asl mri provide moderate improvement. deep learning (dl) emerging technique learn representative signal data without prior modeling highly complex analytically indescribable. purpose study assess whether record breaking performance dl translated asl mri denoising. 

used convolutional neural network (cnn) build dl asl denoising model (dl-asl) inherently consider inter-voxel correlations. better guide dl-asl training, incorporated prior knowledge asl mri: structural similarity asl cbf map grey matter probability map. relatively large sample data used train model subsequently applied new set data testing. experimental results showed dl-asl achieved state-of-the-art denoising performance asl mri compared current routine methods terms higher snr, keeping cbf quantification quality shorten acquisition time 75%, automatic partial volume correction.",4 "principled hybrids generative discriminative domain adaptation. propose probabilistic framework domain adaptation blends generative discriminative modeling principled way. framework, generative discriminative models correspond specific choices prior parameters. provides us general way interpolate generative discriminative extremes different choices priors. maximizing marginal conditional log-likelihoods, models derived framework use labeled instances source domain well unlabeled instances source target domains. framework, show popular reconstruction loss autoencoder corresponds upper bound negative marginal log-likelihoods unlabeled instances, marginal distributions given proper kernel density estimations. provides way interpret empirical success autoencoders domain adaptation semi-supervised learning. instantiate framework using neural networks, build concrete model, dauto. empirically, demonstrate effectiveness dauto text, image speech datasets, showing outperforms related competitors domain adaptation possible.",4 "continuous dr-submodular maximization: structure algorithms. dr-submodular continuous functions important objectives wide real-world applications spanning map inference determinantal point processes (dpps), mean-field inference probabilistic submodular models, amongst others. 
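The Frank-Wolfe update at the heart of such continuous DR-submodular maximization can be sketched on a toy function: F(x1, x2) = x1 + x2 - x1*x2 has all second partial derivatives non-positive (hence DR-submodular) and is monotone on the box [0,1]^2. This is only the basic update under those assumptions; the non-monotone variant in the abstract adds further safeguards:

```python
def F(x):
    # toy monotone DR-submodular objective on [0,1]^2; maximum F(1,1) = 1
    return x[0] + x[1] - x[0] * x[1]

def grad_F(x):
    return [1.0 - x[1], 1.0 - x[0]]

def linear_max(g):
    # linear maximization oracle over the down-closed box [0,1]^2:
    # put mass 1 on coordinates with positive gradient
    return [1.0 if gi > 0 else 0.0 for gi in g]

def frank_wolfe(steps=100):
    """Move a 1/steps fraction toward the LMO solution at every iteration."""
    x = [0.0, 0.0]
    for _ in range(steps):
        v = linear_max(grad_F(x))
        x = [xi + (1.0 / steps) * vi for xi, vi in zip(x, v)]
    return x

x_hat = frank_wolfe()
```

On this instance the iterate reaches (1, 1), the global maximizer; in general the guarantees are only the constant-factor approximations the abstract states.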
dr-submodularity captures subclass non-convex functions enables exact minimization approximate maximization polynomial time. work study problem maximizing non-monotone dr-submodular continuous functions general down-closed convex constraints. start investigating geometric properties underlie objectives, e.g., strong relation (approximately) stationary points global optimum proved. properties used devise two optimization algorithms provable guarantees. concretely, first devise ""two-phase"" algorithm $1/4$ approximation guarantee. algorithm allows use existing methods finding (approximately) stationary points subroutine, thus, harnessing recent progress non-convex optimization. present non-monotone frank-wolfe variant $1/e$ approximation guarantee sublinear convergence rate. finally, extend approach broader class generalized dr-submodular continuous functions, captures wider spectrum applications. theoretical findings validated synthetic real-world problem instances.",4 comparative study arithmetic constraints integer intervals. propose number approaches implement constraint propagation arithmetic constraints integer intervals. end introduce integer interval arithmetic. approach explained using appropriate proof rules reduce variable domains. compare approaches using set benchmarks.,4 "polarity detection movie reviews hindi language. nowadays peoples actively involved giving comments reviews social networking websites websites like shopping websites, news websites etc. large number people everyday share opinion web, results large number user data collected .users also find trivial task read reviews reached decision. would better reviews classified category user finds easier read. opinion mining sentiment analysis natural language processing task mines information various text forms reviews, news, blogs classify basis polarity positive, negative neutral. but, last years, user content hindi language also increasing rapid rate web. 
important perform opinion mining hindi language well. paper hindi language opinion mining system proposed. system classifies reviews positive, negative neutral hindi language. negation also handled proposed system. experimental results using reviews movies show effectiveness system",4 """maximizing rigidity"" revisited: convex programming approach generic 3d shape reconstruction multiple perspective views. rigid structure-from-motion (rsfm) non-rigid structure-from-motion (nrsfm) long treated literature separate (different) problems. inspired previous work solved directly 3d scene structure factoring relative camera poses out, revisit principle ""maximizing rigidity"" structure-from-motion literature, develop unified theory applicable rigid non-rigid structure reconstruction rigidity-agnostic way. formulate problems convex semi-definite program, imposing constraints seek apply principle minimizing non-rigidity. results demonstrate efficacy approach, state-of-the-art accuracy various 3d reconstruction problems.",4 "pushing point view: behavioral measures manipulation wikipedia. major source information virtually topic, wikipedia serves important role public dissemination consumption knowledge. result, presents tremendous potential people promulgate points view; efforts may subtle typical vandalism. paper, introduce new behavioral metrics quantify level controversy associated particular user: controversy score (c-score) based amount attention user focuses controversial pages, clustered controversy score (cc-score) also takes account topical clustering. show measures useful identifying people try ""push"" points view, showing good predictors editors get blocked. metrics used triage potential pov pushers. apply idea dataset users requested promotion administrator status easily identify editors significantly changed behavior upon becoming administrators. time, behavior rampant. promoted administrator status tend stable behavior comparable groups prolific editors. 
suggests adminship process works well, wikipedia community overwhelmed users become administrators promote points view.",4 "hybrid medical image classification using association rule mining decision tree algorithm. main focus image mining proposed method concerned classification brain tumor ct scan brain images. major steps involved system are: pre-processing, feature extraction, association rule mining hybrid classifier. pre-processing step done using median filtering process edge features extracted using canny edge detection technique. two image mining approaches hybrid manner proposed paper. frequent patterns ct scan images generated frequent pattern tree (fp-tree) algorithm mines association rules. decision tree method used classify medical images diagnosis. system enhances classification process accurate. hybrid method improves efficiency proposed method traditional image mining methods. experimental result prediagnosed database brain images showed 97% sensitivity 95% accuracy respectively. physicians make use accurate decision tree classification phase classifying brain images normal, benign malignant effective medical diagnosis.",4 "part-to-whole registration histology mri using shape elements. image registration histology magnetic resonance imaging (mri) challenging task due differences structural content contrast. thick wide specimens cannot processed must cut smaller pieces. dramatically increases complexity problem, since piece individually manually pre-aligned. best knowledge, automatic method reliably locate piece tissue within respective whole mri slice, align without prior information. propose novel automatic approach joint problem multimodal registration histology mri, fraction tissue available histology. approach relies representation images using level lines reach contrast invariance. shape elements obtained via extraction bitangents encoded projective-invariant manner, permits identification common pieces curves two images. 
evaluated approach human brain histology compared resulting alignments manually annotated ground truths. considering complexity brain folding patterns, preliminary results promising suggest use characteristic meaningful shape elements improved robustness efficiency.",4 "use tensorflow. google's machine learning framework tensorflow open-sourced november 2015 [1] since built growing community around it. tensorflow supposed flexible research purposes also allowing models deployed productively. work aimed towards people experience machine learning considering whether use tensorflow environment. several aspects framework important decision examined, heterogeneity, extensibility computation graph. pure python implementation linear classification compared implementation utilizing tensorflow. also contrast tensorflow popular frameworks respect modeling capability, deployment performance give brief description current adaption framework.",4 "possibilistic assumption based truth maintenance system, validation data fusion application. data fusion allows elaboration evaluation situation synthesized low level informations provided different kinds sensors. fusion collected data result fewer higher level informations easily assessed human operator assist effectively decision process. paper present suitability advantages using possibilistic assumption based truth maintenance system (n-atms) data fusion military application. first describe problem, needed knowledge representation formalisms problem solving paradigms. remind reader basic concepts atmss, possibilistic logic 11-atmss. finally detail solution given data fusion problem conclude results comparison non-possibilistic solution.",4 "prepaid postpaid? question. novel methods subscription type prediction mobile phone services. paper investigate behavioural differences mobile phone customers prepaid postpaid subscriptions. 
study reveals (a) postpaid customers active terms service usage (b) strong structural correlations mobile phone call network connections customers subscription type much frequent customers different subscription types. based observations provide methods detect subscription type customers using information personal call statistics, also egocentric networks simultaneously. key first approach cast classification problem problem graph labelling, solved max-flow min-cut algorithms. experiments show that, using user attributes relationships, proposed graph labelling approach able achieve classification accuracy $\sim 87\%$, outperforms $\sim 7\%$ supervised learning methods using user attributes. second problem aim infer subscription type customers external operators. propose via approximate methods solve problem using node attributes, two-ways indirect inference method based observed homophilic structural correlations. results straightforward applications behavioural prediction personal marketing.",4 "reinforced video captioning entailment rewards. sequence-to-sequence models shown promising improvements temporal task video captioning, optimize word-level cross-entropy loss training. first, using policy gradient mixed-loss methods reinforcement learning, directly optimize sentence-level task-based metrics (as rewards), achieving significant improvements baseline, based automatic metrics human evaluation multiple datasets. next, propose novel entailment-enhanced reward (cident) corrects phrase-matching based metrics (such cider) allow logically-implied partial matches avoid contradictions, achieving significant improvements cider-reward model. overall, cident-reward model achieves new state-of-the-art msr-vtt dataset.",4 "deep reinforcement learning boosted external knowledge. recent improvements deep reinforcement learning allowed solve problems many 2d domains atari games. 
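The graph-labelling formulation used above for subscription-type prediction assigns each customer a binary label by minimising per-user (unary) costs plus a penalty on call-graph neighbours that disagree; max-flow/min-cut solves this energy exactly. On toy sizes the same energy can be minimised by exhaustive search, which is what this sketch does (all numbers are invented for illustration):

```python
from itertools import product

def best_labelling(unary, edges, lam=1.0):
    """Minimise sum_i unary[i][l_i] + lam * (# edges whose endpoints disagree).
    Exhaustive search stands in for a max-flow/min-cut solver at toy sizes."""
    n = len(unary)
    best, best_cost = None, float("inf")
    for labels in product((0, 1), repeat=n):
        cost = sum(unary[i][labels[i]] for i in range(n))
        cost += lam * sum(1 for i, j in edges if labels[i] != labels[j])
        if cost < best_cost:
            best, best_cost = list(labels), cost
    return best, best_cost

# unary[i][l]: cost of giving user i label l (0 = prepaid, 1 = postpaid),
# e.g. derived from personal call statistics; edges follow the call graph.
unary = [[0.2, 1.0], [0.4, 0.8], [1.0, 0.1], [0.9, 0.3]]
edges = [(0, 1), (1, 2), (2, 3)]
labels, cost = best_labelling(unary, edges, lam=0.5)
```

The pairwise term encodes the observed homophily: connected customers tend to share a subscription type, so the minimiser smooths noisy unary predictions along the call graph.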
however, complex 3d environments, numerous learning episodes required may time consuming even impossible especially real-world scenarios. present new architecture combine external knowledge deep reinforcement learning using visual input. key concept system augmenting image input adding environment feature information combining two sources decision. evaluate performances method 3d partially-observable environment microsoft malmo platform. experimental evaluation exhibits higher performance faster learning compared single reinforcement learning model.",4 note kullback-leibler divergence von mises-fisher distribution. present derivation kullback leibler (kl)-divergence (also known relative entropy) von mises fisher (vmf) distribution $d$-dimensions.,19 "weighting scheme pairwise multi-label classifier based fuzzy confusion matrix. work addressed issue applying stochastic classifier local, fuzzy confusion matrix framework multi-label classification. proposed novel solution problem correcting label pairwise ensembles. main step correction procedure compute classifier-specific competence cross-competence measures, estimates error pattern underlying classifier. fusion phase employed two weighting approaches based information theory. classifier weights promote base classifiers susceptible correction based fuzzy confusion matrix. experimental study, proposed approach compared two reference methods. comparison made terms six different quality criteria. conducted experiments reveals proposed approach eliminates one main drawbacks original fcm-based approach i.e. original approach vulnerable imbalanced class/label distribution. more, obtained results shows introduced method achieves satisfying classification quality considered quality criteria. additionally, impact fluctuations data set characteristics reduced.",4 "scalable multi-class gaussian process classification using expectation propagation. 
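The derivation summarised in the vMF note above follows from the exponential-family form of the density; a sketch of the standard closed form, with the usual normaliser $C_d$ and mean resultant length $A_d$ written out (stated here from the standard literature, not copied from the note itself):

```latex
% vMF density: f_i(x) = C_d(\kappa_i)\exp(\kappa_i \mu_i^\top x), with
% C_d(\kappa) = \kappa^{d/2-1} / \big((2\pi)^{d/2} I_{d/2-1}(\kappa)\big).
% Using \mathbb{E}_{f_1}[x] = A_d(\kappa_1)\,\mu_1,
% where A_d(\kappa) = I_{d/2}(\kappa)/I_{d/2-1}(\kappa):
\begin{align}
\mathrm{KL}(f_1 \,\|\, f_2)
  &= \mathbb{E}_{f_1}\!\left[\log f_1(x) - \log f_2(x)\right] \\
  &= \log\frac{C_d(\kappa_1)}{C_d(\kappa_2)}
     + \left(\kappa_1\mu_1 - \kappa_2\mu_2\right)^{\top}\mathbb{E}_{f_1}[x] \\
  &= \log\frac{C_d(\kappa_1)}{C_d(\kappa_2)}
     + \left(\kappa_1 - \kappa_2\,\mu_1^{\top}\mu_2\right) A_d(\kappa_1).
\end{align}
```

The whole calculation reduces to knowing the mean of a vMF variable, which is why the note can be short.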
paper describes expectation propagation (ep) method multi-class classification gaussian processes scales well large datasets. method estimate log-marginal-likelihood involves sum across data instances. enables efficient training using stochastic gradients mini-batches. type training used, computational cost depend number data instances $n$. furthermore, extra assumptions approximate inference process make memory cost independent $n$. consequence proposed ep method used datasets millions instances. compare empirically method alternative approaches approximate required computations using variational inference. results show performs similar even better techniques, sometimes give significantly worse predictive distributions terms test log-likelihood. besides this, training process proposed approach also seems converge smaller number iterations.",19 "predicting demographics high-resolution geographies geotagged tweets. paper, consider problem predicting demographics geographic units given geotagged tweets composed within units. traditional survey methods offer demographics estimates usually limited terms geographic resolution, geographic boundaries, time intervals. thus, would highly useful develop computational methods complement traditional survey methods offering demographics estimates finer geographic resolutions, flexible geographic boundaries (i.e. confined administrative boundaries), different time intervals. prior work focused predicting demographics health statistics relatively coarse geographic resolutions county-level state-level, introduce approach predict demographics finer geographic resolutions blockgroup-level. task predicting gender race/ethnicity counts blockgroup-level, approach adapted prior work problem achieves average correlation 0.389 (gender) 0.569 (race) held-out test dataset. approach outperforms prior approach average correlation 0.671 (gender) 0.692 (race).",4 "sparse signal subspace decomposition based adaptive over-complete dictionary. 
paper proposes subspace decomposition method based over-complete dictionary sparse representation, called ""sparse signal subspace decomposition"" (or 3sd) method. method makes use novel criterion based occurrence frequency atoms dictionary data set. criterion, well adapted subspace-decomposition dependent basis set, adequately reflects intrinsic characteristic regularity signal. 3sd method combines variance, sparsity component frequency criteria unified framework. takes benefits using over-complete dictionary preserves details subspace decomposition rejects strong noise. 3sd method simple linear retrieval operation. require prior knowledge distributions parameters. applied image denoising, demonstrates high performances preserving fine details suppressing strong noise.",19 "solving multistage influence diagrams using branch-and-bound search. branch-and-bound approach solving influence diagrams previously proposed literature, appears never implemented evaluated - apparently due difficulties computing effective bounds branch-and-bound search. paper, describe efficiently compute effective bounds, develop practical implementation depth-first branch-and-bound search influence diagram evaluation outperforms existing methods solving influence diagrams multiple stages.",4 "training adaptive dialogue policy interactive learning visually grounded word meanings. present multi-modal dialogue system interactive learning perceptually grounded word meanings human tutor. system integrates incremental, semantic parsing/generation framework - dynamic syntax type theory records (ds-ttr) - set visual classifiers learned throughout interaction ground meaning representations produces. use system interaction simulated human tutor study effects different dialogue policies capabilities accuracy learned meanings, learning rates, efforts/costs tutor. 
show overall performance learning agent affected (1) takes initiative dialogues; (2) ability express/use confidence level visual attributes; (3) ability process elliptical incrementally constructed dialogue turns. ultimately, train adaptive dialogue policy optimises trade-off classifier accuracy tutoring costs.",4 "neural multi-task learning automated assessment. grammatical error detection automated essay scoring two tasks area automated assessment. traditionally tasks treated independently different machine learning models features used task. paper, develop multi-task neural network model jointly optimises tasks, particular show neural automated essay scoring significantly improved. show essay score provides little evidence inform grammatical error detection, essay score highly influenced error detection.",4 "structured pruning deep convolutional neural networks. real time application deep learning algorithms often hindered high computational complexity frequent memory accesses. network pruning promising technique solve problem. however, pruning usually results irregular network connections demand extra representation efforts also fit well parallel computation. introduce structured sparsity various scales convolutional neural networks, channel wise, kernel wise intra kernel strided sparsity. structured sparsity advantageous direct computational resource savings embedded computers, parallel computing environments hardware based systems. decide importance network connections paths, proposed method uses particle filtering approach. importance weight particle assigned computing misclassification rate corresponding connectivity pattern. pruned network re-trained compensate losses due pruning. implementing convolutions matrix products, particularly show intra kernel strided sparsity simple constraint significantly reduce size kernel feature map matrices. pruned network finally fixed point optimized reduced word length precision. 
results significant reduction total storage size providing advantages on-chip memory based implementations deep neural networks.",4 "fuzzy-rough feature selection π-membership function mammogram classification. breast cancer second leading cause death among women diagnosed help mammograms. oncologists miserably failed identifying micro calcification early stage help mammogram visually. order improve performance breast cancer screening, researchers proposed computer aided diagnosis using image processing. study mammograms preprocessed features extracted, abnormality identified classification. extracted features used, cases misidentified. hence feature selection procedure sought. paper, fuzzy-rough feature selection {\pi} membership function proposed. selected features used classify abnormalities help ant-miner weka tools. experimental analysis shows proposed method improves mammograms classification accuracy.",4 "peduncle detection sweet pepper autonomous crop harvesting - combined colour 3d information. paper presents 3d visual detection method challenging task detecting peduncles sweet peppers (capsicum annuum) field. cutting peduncle cleanly one difficult stages harvesting process, peduncle part crop attaches main stem plant. accurate peduncle detection 3d space therefore vital step reliable autonomous harvesting sweet peppers, lead precise cutting avoiding damage surrounding plant. paper makes use colour geometry information acquired rgb-d sensor utilises supervised-learning approach peduncle detection task. performance proposed method demonstrated evaluated using qualitative quantitative results (the area-under-the-curve (auc) detection precision-recall curve). able achieve auc 0.71 peduncle detection field-grown sweet peppers. release set manually annotated 3d sweet pepper peduncle images assist research community performing research topic.",4 "large-scale video classification guided batch normalized lstm translator. 
youtube-8m dataset enhances development large-scale video recognition technology imagenet dataset encouraged image classification, recognition detection artificial intelligence fields. large video dataset, challenging task classify huge amount multi-labels. change perspective, propose novel method regarding labels words. details, describe online learning approaches multi-label video classification guided deep recurrent neural networks video sentence translator. designed translator based lstms found stochastic gating input lstm cell help us design structural details. addition, adopted batch normalizations models improve lstm models. since models feature extractors, used classifiers. finally report improved validation results models large-scale youtube-8m datasets discussions improvement.",4 "(not) train generative model: scheduled sampling, likelihood, adversary?. modern applications progress deep learning research created renewed interest generative models text images. however, even today unclear objective functions one use train evaluate models. paper present two contributions. firstly, present critique scheduled sampling, state-of-the-art training method contributed winning entry mscoco image captioning benchmark 2015. show despite impressive empirical performance, objective function underlying scheduled sampling improper leads inconsistent learning algorithm. secondly, revisit problems scheduled sampling meant address, present alternative interpretation. argue maximum likelihood inappropriate training objective end-goal generate natural-looking samples. go derive ideal objective function use situation instead. introduce generalisation adversarial training, show method interpolate maximum likelihood training ideal training objective. knowledge first theoretical analysis explains adversarial training tends produce samples higher perceived quality.",19 "extreme clicking efficient object annotation. 
manually annotating object bounding boxes central building computer vision datasets, time consuming (annotating ilsvrc [53] took 35s one high-quality box [62]). involves clicking imaginary corners tight box around object. difficult corners often outside actual object several adjustments required obtain tight box. propose extreme clicking instead: ask annotator click four physical points object: top, bottom, left-most right-most points. task natural points easy find. crowd-source extreme point annotations pascal voc 2007 2012 show (1) annotation time 7s per box, 5x faster traditional way drawing boxes [62]; (2) quality boxes good original ground-truth drawn traditional way; (3) detectors trained annotations accurate trained original ground-truth. moreover, extreme clicking strategy yields box coordinates, also four accurate boundary points. show (4) incorporate grabcut obtain accurate segmentations delivered initializing bounding boxes; (5) semantic segmentations models trained segmentations outperform trained segmentations derived bounding boxes.",4 "algorithms computing greatest simulations bisimulations fuzzy automata. recently, two types simulations (forward backward simulations) four types bisimulations (forward, backward, forward-backward, backward-forward bisimulations) fuzzy automata introduced. least one simulation/bisimulation types given fuzzy automata, proved greatest simulation/bisimulation kind. present paper, above-mentioned types simulations/bisimulations provide effective algorithm deciding whether simulation/bisimulation type given fuzzy automata, computing greatest one, whenever exists. algorithms based method developed [j. ignjatovi\'c, m. \'ciri\'c, s. 
bogdanovi\'c, greatest solutions certain systems fuzzy relation inequalities equations, fuzzy sets systems 161 (2010) 3081-3113], comes computing greatest post-fixed point, contained given fuzzy relation, isotone function lattice fuzzy relations.",4 "nonlinear metric learning knn svms geometric transformations. recent years, research efforts extend linear metric learning models handle nonlinear structures attracted great interests. paper, propose novel nonlinear solution utilization deformable geometric models learn spatially varying metrics, apply strategy boost performance knn svm classifiers. thin-plate splines (tps) chosen geometric model due remarkable versatility representation power accounting high-order deformations. transforming input space tps, pull same-class neighbors closer pushing different-class points farther away knn, well make input data points linearly separable svms. improvements performance knn classification demonstrated experiments synthetic real world datasets, comparisons made several state-of-the-art metric learning solutions. svm-based models also achieve significant improvements traditional linear kernel svms datasets.",4 "learning paraphrase: unsupervised approach using multiple-sequence alignment. address text-to-text generation problem sentence-level paraphrasing -- phenomenon distinct difficult word- phrase-level paraphrasing. approach applies multiple-sequence alignment sentences gathered unannotated comparable corpora: learns set paraphrasing patterns represented word lattice pairs automatically determines apply patterns rewrite new sentences. results evaluation experiments show system derives accurate paraphrases, outperforming baseline systems.",4 "commonly uncommon: semantic sparsity situation recognition. semantic sparsity common challenge structured visual classification problems; output space complex, vast majority possible predictions rarely, ever, seen training set. 
paper studies semantic sparsity situation recognition, task producing structured summaries happening images, including activities, objects roles objects play within activity. problem, find empirically object-role combinations rare, current state-of-the-art models significantly underperform sparse data regime. avoid many errors (1) introducing novel tensor composition function learns share examples across role-noun combinations (2) semantically augmenting training data automatically gathered examples rarely observed outputs using web data. integrated within complete crf-based structured prediction model, tensor-based approach outperforms existing state art relative improvement 2.11% 4.40% top-5 verb noun-role accuracy, respectively. adding 5 million images semantic augmentation techniques gives relative improvements 6.23% 9.57% top-5 verb noun-role accuracy.",4 "deepqa: improving estimation single protein model quality deep belief networks. protein quality assessment (qa) ranking selecting protein models long viewed one major challenges protein tertiary structure prediction. especially, estimating quality single protein model, important selecting good models large model pool consisting mostly low-quality models, still largely unsolved problem. introduce novel single-model quality assessment method deepqa based deep belief network utilizes number selected features describing quality model different perspectives, energy, physio-chemical characteristics, structural information. deep belief network trained several large datasets consisting models critical assessment protein structure prediction (casp) experiments, several publicly available datasets, models generated in-house ab initio method. experiment demonstrate deep belief network better performance compared support vector machines neural networks protein model quality assessment problem, method deepqa achieves state-of-the-art performance casp11 dataset. 
also outperformed two well-established methods selecting good outlier models large set models mostly low quality generated ab initio modeling methods. deepqa useful tool protein single model quality assessment protein structure prediction. source code, executable, document training/test datasets deepqa linux freely available non-commercial users http://cactus.rnet.missouri.edu/deepqa/.",4 "extending object-oriented languages declarative specifications complex objects using answer-set programming. many applications require complexly structured data objects. developing new adapting existing algorithmic solutions creating objects non-trivial costly task considered objects subject different application-specific constraints. often, however, comparatively easy declaratively describe required objects. paper, propose use answer-set programming (asp)---a well-established declarative programming paradigm area artificial intelligence---for instantiating objects standard object-oriented programming languages. particular, extend java declarative specifications required objects automatically generated using available asp solver technology.",4 "vicious circle principle formation sets asp based languages. paper continues investigation poincare russell's vicious circle principle (vcp) context design logic programming languages sets. expand previously introduced language alog aggregates allowing infinite sets several additional set related constructs useful knowledge representation teaching. addition, propose alternative formalization original vcp incorporate semantics new language, slog+, allows liberal construction sets use programming rules. show that, programs without disjunction infinite sets, formal semantics aggregates slog+ coincides several known languages. intuitive formal semantics, however, based quite different ideas seem involved slog+.",4 "stochastic proximal gradient descent nuclear norm regularization. 
paper, utilize stochastic optimization reduce space complexity convex composite optimization nuclear norm regularizer, variable matrix size $m \times n$. constructing low-rank estimate gradient, propose iterative algorithm based stochastic proximal gradient descent (spgd), take last iterate spgd final solution. main advantage proposed algorithm space complexity $o(m+n)$, contrast, previous algorithms $o(mn)$ space complexity. theoretical analysis shows achieves $o(\log t/\sqrt{t})$ $o(\log t/t)$ convergence rates general convex functions strongly convex functions, respectively.",4 "last-step regression algorithm non-stationary online learning. goal learner standard online learning maintain average loss close loss best-performing single function class. many real-world problems, rating ranking items, single best target function runtime algorithm, instead best (local) target function drifting time. develop novel last-step minmax optimal algorithm context drift. analyze algorithm worst-case regret framework show maintains average loss close best slowly changing sequence linear functions, long total drift sublinear. situations, bound improves existing bounds, additionally algorithm suffers logarithmic regret drift. also build h_infinity filter bound, develop analyze second algorithm drifting setting. synthetic simulations demonstrate advantages algorithms worst-case constant drift setting.",4 "combat models rts games. game tree search algorithms, monte carlo tree search (mcts), require access forward model (or ""simulator"") game hand. however, games forward model readily available. paper presents three forward models two-player attrition games, call ""combat models"", show used simulate combat rts games. also show combat models learned replay data. use starcraft application domain. report experiments comparing combat models predicting combat output impact used tactical decisions real game.",4 "supervised learning similarity functions. 
address problem general supervised learning data accessed (indefinite) similarity function data points. existing work learning indefinite kernels concentrated solely binary/multi-class classification problems. propose model generic enough handle supervised learning task also subsumes model previously proposed classification. give ""goodness"" criterion similarity functions w.r.t. given supervised learning task adapt well-known landmarking technique provide efficient algorithms supervised learning using ""good"" similarity functions. demonstrate effectiveness model three important supervised learning problems: a) real-valued regression, b) ordinal regression c) ranking show method guarantees bounded generalization error. furthermore, case real-valued regression, give natural goodness definition that, used conjunction recent result sparse vector recovery, guarantees sparse predictor bounded generalization error. finally, report results learning algorithms regression ordinal regression tasks using non-psd similarity functions demonstrate effectiveness algorithms, especially sparse landmark selection algorithm achieves significantly higher accuracies baseline methods offering reduced computational costs.",4 "modeling uncertain temporal evolutions model-based diagnosis. although notion diagnostic problem extensively investigated context static systems, practical applications behavior modeled system significantly variable time. goal paper propose novel approach modeling uncertainty temporal evolutions time-varying systems characterization model-based temporal diagnosis. since real world cases knowledge temporal evolution system diagnosed uncertain, consider case probabilistic temporal knowledge available component system choose model means markov chains. fact, aim exploiting statistical assumptions underlying reliability theory context diagnosis time-varying systems. 
finally show exploit markov chain theory order discard, diagnostic process, unlikely diagnoses.",4 "markov decision processes continuous side information. consider reinforcement learning (rl) setting agent interacts sequence episodic mdps. start episode agent access side-information context determines dynamics mdp episode. setting motivated applications healthcare baseline measurements patient start treatment episode form context may provide information patient might respond treatment decisions. propose algorithms learning contextual markov decision processes (cmdps) assumption unobserved mdp parameters vary smoothly observed context. also give lower upper pac bounds smoothness assumption. lower bound exponential dependence dimension, consider tractable linear setting context used create linear combinations finite set mdps. linear setting, give pac learning algorithm based kwik learning techniques.",19 "new optimal stepsize approximate dynamic programming. approximate dynamic programming (adp) proven wide range applications spanning large-scale transportation problems, health care, revenue management, energy systems. design effective adp algorithms many dimensions, one crucial factor stepsize rule used update value function approximation. many operations research applications computationally intensive, important obtain good results quickly. furthermore, popular stepsize formulas use tunable parameters produce poor results tuned improperly. derive new stepsize rule optimizes prediction error order improve short-term performance adp algorithm. one, relatively insensitive tunable parameter, new rule adapts level noise problem produces faster convergence numerical experiments.",12 "factorization discrete probability distributions. formulate necessary sufficient conditions arbitrary discrete probability distribution factor according undirected graphical model, log-linear model, general exponential models. 
result generalizes well known hammersley-clifford theorem.",4 "detection unauthorized iot devices using machine learning techniques. security experts demonstrated numerous risks imposed internet things (iot) devices organizations. due widespread adoption devices, diversity, standardization obstacles, inherent mobility, organizations require intelligent mechanism capable automatically detecting suspicious iot devices connected networks. particular, devices included white list trustworthy iot device types (allowed used within organizational premises) detected. research, random forest, supervised machine learning algorithm, applied features extracted network traffic data aim accurately identifying iot device types white list. train evaluate multi-class classifiers, collected manually labeled network traffic data 17 distinct iot devices, representing nine types iot devices. based classification 20 consecutive sessions use majority rule, iot device types white list correctly detected unknown 96% test cases (on average), white listed device types correctly classified actual types 99% cases. iot device types identified quicker others (e.g., sockets thermostats successfully detected within five tcp sessions connecting network). perfect detection unauthorized iot device types achieved upon analyzing 110 consecutive sessions; perfect classification white listed types required 346 consecutive sessions, 110 resulted 99.49% accuracy. experiments demonstrated successful applicability classifiers trained one location tested another. addition, discussion provided regarding resilience machine learning-based iot white listing method adversarial attacks.",4 "data mining prediction human performance capability software-industry. recruitment new personnel one essential business processes affect quality human capital within company. highly essential companies ensure recruitment right talent maintain competitive edge others market. 
however companies often face problem recruiting new people ongoing projects due lack proper framework defines criteria selection process. paper aim develop framework would allow project manager take right decision selecting new talent correlating performance parameters domain-specific attributes candidates. also, another important motivation behind project check validity selection procedure often followed various big companies public private sectors focus academic scores, gpa/grades students colleges academic backgrounds. test decision produce optimal results industry need change offers holistic approach recruitment new talent software companies. scope work extends beyond domain similar procedure adopted develop recruitment framework fields well. data-mining techniques provide useful information historical projects depending hiring-manager make decisions recruiting high-quality workforce. study aims bridge hiatus developing data-mining framework based ensemble-learning technique refocus criteria personnel selection. results research clearly demonstrated need refocus selection-criteria quality objectives.",4 "combining multiple time series models robust weighted mechanism. improvement time series forecasting accuracy combining multiple models important well dynamic area research. result, various forecasts combination methods developed literature. however, based simple linear ensemble strategies hence ignore possible relationships two participating models. paper, propose robust weighted nonlinear ensemble technique considers individual forecasts different models well correlations among combining. proposed ensemble constructed using three well-known forecasting models tested three real-world time series. comparison made among proposed scheme three widely used linear combination methods, terms obtained forecast errors. 
comparison shows ensemble scheme provides significantly lower forecast errors individual model well four linear combination methods.",4 "ranking sentences extractive summarization reinforcement learning. single document summarization task producing shorter version document preserving principal information content. paper conceptualize extractive summarization sentence ranking task propose novel training algorithm globally optimizes rouge evaluation metric reinforcement learning objective. use algorithm train neural summarization model cnn dailymail datasets demonstrate experimentally outperforms state-of-the-art extractive abstractive systems evaluated automatically humans.",4 "vqs: linking segmentations questions answers supervised attention vqa question-focused semantic segmentation. rich dense human labeled datasets among main enabling factors recent advance vision-language understanding. many seemingly distant annotations (e.g., semantic segmentation visual question answering (vqa)) inherently connected reveal different levels perspectives human understandings visual scenes --- even set images (e.g., coco). popularity coco correlates annotations tasks. explicitly linking may significantly benefit individual tasks unified vision language modeling. present preliminary work linking instance segmentations provided coco questions answers (qas) vqa dataset, name collected links visual questions segmentation answers (vqs). transfer human supervision previously separate tasks, offer effective leverage existing problems, also open door new research problems models. study two applications vqs data paper: supervised attention vqa novel question-focused semantic segmentation task. former, obtain state-of-the-art results vqa real multiple-choice task simply augmenting multilayer perceptrons attention features learned using segmentation-qa links explicit supervision. 
put latter perspective, study two plausible methods compare oracle method assuming instance segmentations given test stage.",4 "deep transfer learning: new deep learning glitch classification method advanced ligo. exquisite sensitivity advanced ligo detectors enabled detection multiple gravitational wave signals. sophisticated design detectors mitigates effect types noise. however, advanced ligo data streams contaminated numerous artifacts known glitches: non-gaussian noise transients complex morphologies. given high rate occurrence, glitches lead false coincident detections, obscure even mimic gravitational wave signals. therefore, successfully characterizing removing glitches advanced ligo data utmost importance. here, present first application deep transfer learning glitch classification, showing knowledge deep learning algorithms trained real-world object recognition transferred classifying glitches time-series based spectrogram images. using gravity spy dataset, containing hand-labeled, multi-duration spectrograms obtained real ligo data, demonstrate method enables optimal use deep convolutional neural networks classification given small training datasets, significantly reduces time training networks, achieves state-of-the-art accuracy 98.8%, perfect precision-recall 8 22 classes. furthermore, new types glitches classified accurately given labeled examples technique. trained via transfer learning, show convolutional neural networks truncated used excellent feature extractors unsupervised clustering methods identify new classes based morphology, without labeled examples. therefore, provides new framework dynamic glitch classification gravitational wave detectors, expected encounter new types noise undergo gradual improvements attain design sensitivity.",7 "hand keypoint detection single images using multiview bootstrapping. present approach uses multi-camera system train fine-grained detectors keypoints prone occlusion, joints hand. 
call procedure multiview bootstrapping: first, initial keypoint detector used produce noisy labels multiple views hand. noisy detections triangulated 3d using multiview geometry marked outliers. finally, reprojected triangulations used new labeled training data improve detector. repeat process, generating labeled data iteration. derive result analytically relating minimum number views achieve target true false positive rates given detector. method used train hand keypoint detector single images. resulting keypoint detector runs realtime rgb images accuracy comparable methods use depth sensors. single view detector, triangulated multiple views, enables 3d markerless hand motion capture complex object interactions.",4 "minimalist grammars minimalist categorial grammars, definitions toward inclusion generated languages. stabler proposes implementation chomskyan minimalist program, chomsky 95 minimalist grammars - mg, stabler 97. framework inherits long linguistic tradition. semantic calculus easily added one uses curry-howard isomorphism. minimalist categorial grammars - mcg, based extension lambek calculus, mixed logic, introduced provide theoretically-motivated syntax-semantics interface, amblard 07. article, give full definitions mg algebraic tree descriptions mcg, take first steps towards giving proof inclusion generated languages.",4 "communication-efficient algorithm distributed sparse learning via two-way truncation. propose communicationally computationally efficient algorithm high-dimensional distributed sparse learning. iteration, local machines compute gradient local data master machine solves one shifted $l_1$ regularized minimization problem. communication cost reduced constant times dimension number state-of-the-art algorithm constant times sparsity number via two-way truncation procedure. theoretically, prove estimation error proposed algorithm decreases exponentially matches centralized method mild assumptions. 
extensive experiments simulated data real data verify proposed algorithm efficient performance comparable centralized method solving high-dimensional sparse learning problems.",19 "sentrna: improving computational rna design incorporating prior human design strategies. designing rna sequences fold specific structures perform desired biological functions emerging field bioengineering broad applications intracellular chemical catalysis cancer therapy via selective gene silencing. effective rna design requires first solving inverse folding problem: given target structure, propose sequence folds structure. although significant progress made developing computational algorithms purpose, current approaches ineffective designing sequences complex targets, limiting utility real-world applications. however, alternative shown significantly higher performance human players online rna design game eterna. many rounds gameplay, players developed collective library ""human"" rules strategies rna design proven effective current computational approaches, especially complex targets. here, present rna design agent, sentrna, consists fully-connected neural network trained using $eternasolves$ dataset, set $1.8 \times 10^4$ player-submitted sequences across 724 unique targets. agent first predicts initial sequence target using trained network, refines solution necessary using short adaptive walk utilizing canon standard design moves. approach, observe sentrna learn apply human-like design strategies solve several complex targets previously unsolvable computational approach. thus demonstrate incorporating prior human design strategies computational agent significantly boost performance, suggests new paradigm machine-based rna design.",16 "towards closing energy gap hog cnn features embedded vision. computer vision enables wide range applications robotics/drones, self-driving cars, smart internet things, portable/wearable electronics. 
many applications, local embedded processing preferred due privacy and/or latency concerns. accordingly, energy-efficient embedded vision hardware delivering real-time robust performance crucial. deep learning gaining popularity several computer vision algorithms, significant energy consumption difference exists compared traditional hand-crafted approaches. paper, provide in-depth analysis computation, energy accuracy trade-offs learned features deep convolutional neural networks (cnn) hand-crafted features histogram oriented gradients (hog). analysis supported measurements two chips implement algorithms. goal understand source energy discrepancy two approaches provide insight potential areas cnns improved eventually approach energy-efficiency hog maintaining outstanding performance accuracy.",4 "do's don'ts cnn-based face verification. research community appears developed consensus methods acquiring annotated data, design training cnns, many questions still remain answered. paper, explore following questions critical face recognition research: (i) train still images expect systems work videos? (ii) deeper datasets better wider datasets? (iii) adding label noise lead improvement performance deep networks? (iv) alignment needed face recognition? address questions training cnns using casia-webface, umdfaces, new video dataset testing youtube-faces, ijb-a disjoint portion umdfaces datasets. new data set, made publicly available, 22,075 videos 3,735,476 human annotated frames extracted them.",4 "click here: human-localized keypoints guidance viewpoint estimation. motivate address human-in-the-loop variant monocular viewpoint estimation task location class one semantic object keypoint available test time. order leverage keypoint information, devise convolutional neural network called click-here cnn (ch-cnn) integrates keypoint information activations layers process image. transforms keypoint information 2d map used weigh features certain parts image heavily. 
weighted sum spatial features combined global image features provide relevant information prediction layers. train network, collect novel dataset 3d keypoint annotations thousands cad models, synthetically render millions images 2d keypoint information. test instances pascal 3d+, model achieves mean class accuracy 90.7%, whereas state-of-the-art baseline obtains 85.7% mean class accuracy, justifying argument human-in-the-loop inference.",4 "regularized richardson-lucy algorithm sparse reconstruction poissonian images. restoration digital images degraded measurements always problem great theoretical practical importance numerous applications imaging sciences. specific solution problem image restoration generally determined nature degradation phenomenon well statistical properties measurement noises. present study concerned case images interest corrupted convolutional blurs poisson noises. deal problems, exists range solution methods based principles originating fixed-point algorithm richardson lucy (rl). paper, provide conceptual experimental proof methods tend converge sparse solutions, makes applicable images represented relatively small number non-zero samples spatial domain. unfortunately, set images relatively small, restricts applicability rl-type methods. hand, virtually practical images admit sparse representations domain properly designed linear transform. take advantage fact, therefore tempting modify rl algorithm make recover representation coefficients, rather values associated image. modification introduced paper. apart generality assumptions, proposed method also superior many established reconstruction approaches terms estimation accuracy computational complexity. conclusions study validated series numerical experiments.",4 "gender identity lexical variation social media. present study relationship gender, linguistic style, social networks, using novel corpus 14,000 twitter users. 
prior quantitative work gender often treats social variable female/male binary; argue nuanced approach. clustering twitter users, find natural decomposition dataset various styles topical interests. many clusters strong gender orientations, use linguistic resources sometimes directly conflicts population-level language statistics. view clusters accurate reflection multifaceted nature gendered language styles. previous corpus-based work also little say individuals whose linguistic styles defy population-level gender patterns. identify individuals, train statistical classifier, measure classifier confidence individual dataset. examining individuals whose language match classifier's model gender, find social networks include significantly fewer same-gender social connections that, general, social network homophily correlated use same-gender language markers. pairing computational methods social theory thus offers new perspective gender emerges individuals position relative audiences, topics, mainstream gender norms.",4 "deep generative deconvolutional image model. deep generative model developed representation analysis images, based hierarchical convolutional dictionary-learning framework. stochastic {\em unpooling} employed link consecutive layers model, yielding top-down image generation. bayesian support vector machine linked top-layer features, yielding max-margin discrimination. deep deconvolutional inference employed testing, infer latent features, top-layer features connected max-margin classifier discrimination tasks. model efficiently trained using monte carlo expectation-maximization (mcem) algorithm, implementation graphical processor units (gpus) efficient large-scale learning, fast testing. excellent results obtained several benchmark datasets, including imagenet, demonstrating proposed model achieves results highly competitive similarly sized convolutional neural networks.",4 "smoothing stochastic gradient method composite optimization. 
consider unconstrained optimization problem whose objective function composed smooth non-smooth components smooth component expectation random function. type problem arises interesting applications machine learning. propose stochastic gradient descent algorithm class optimization problem. non-smooth component particular structure, propose another stochastic gradient descent algorithm incorporating smoothing method first algorithm. proofs convergence rates two algorithms given show numerical performance algorithm applying regularized linear regression problems different sets synthetic data.",12 "skin lesion segmentation: u-nets versus clustering. many automatic skin lesion diagnosis systems use segmentation preprocessing step diagnose skin conditions skin lesion shape, border irregularity, size influence likelihood malignancy. paper presents, examines compares two different approaches skin lesion segmentation. first approach uses u-nets introduces histogram equalization based preprocessing step. second approach c-means clustering based approach much simpler implement faster execute. jaccard index algorithm output hand segmented images dermatologists used evaluate proposed algorithms. many recently proposed deep neural networks segment skin lesions require significant amount computational power training (i.e., computer gpus), main objective paper present methods used cpu. severely limits, example, number training instances presented u-net. comparing two proposed algorithms, u-nets achieved significantly higher jaccard index compared clustering approach. moreover, using histogram equalization preprocessing step significantly improved u-net segmentation results.",4 "pixel deconvolutional networks. deconvolutional layers widely used variety deep models up-sampling, including encoder-decoder networks semantic segmentation deep generative models unsupervised learning. one key limitations deconvolutional operations result so-called checkerboard problem.
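The smoothing idea described above can be sketched as SGD on a toy regularized linear regression where the non-smooth l1 term is replaced by a Huber-style mu-smoothing. All problem sizes, step sizes, and constants below are illustrative assumptions, not the paper's algorithm verbatim.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, mu = 500, 10, 0.1, 0.01      # toy sizes and smoothing parameter
x_true = np.zeros(d); x_true[:3] = [1.0, -2.0, 0.5]
A = rng.normal(size=(n, d))
b = A @ x_true + 0.01 * rng.normal(size=n)

def smoothed_l1_grad(x, mu):
    # gradient of the mu-smoothed |.|: x/mu near zero, sign(x) outside
    return np.clip(x / mu, -1.0, 1.0)

x = np.zeros(d)
for t in range(1, 20001):
    i = rng.integers(n)                      # draw one random data point
    g = 2.0 * (A[i] @ x - b[i]) * A[i]       # stochastic gradient, smooth part
    g += lam * smoothed_l1_grad(x, mu)       # gradient of smoothed l1 part
    x -= (0.03 / np.sqrt(t)) * g             # decaying step size

mse = np.mean((A @ x - b) ** 2)
print(np.round(x, 2))
```

Because the smoothed objective is differentiable everywhere, plain SGD applies; as mu shrinks, the iterates approach the lasso solution.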
caused fact direct relationship exists among adjacent pixels output feature map. address problem, propose pixel deconvolutional layer (pixeldcl) establish direct relationships among adjacent pixels up-sampled feature map. method based fresh interpretation regular deconvolution operation. resulting pixeldcl used replace deconvolutional layer plug-and-play manner without compromising fully trainable capabilities original models. proposed pixeldcl may result slight decrease efficiency, overcome implementation trick. experimental results semantic segmentation demonstrate pixeldcl consider spatial features edges shapes yields accurate segmentation outputs deconvolutional layers. used image generation tasks, pixeldcl largely overcome checkerboard problem suffered regular deconvolution operations.",4 "neural network assembly memory model based optimal binary signal detection theory. ternary/binary data coding algorithm conditions hopfield networks implement optimal convolutional hamming decoding algorithms described. using coding/decoding approach (an optimal binary signal detection theory, bsdt) introduced neural network assembly memory model (nnamm) built. model provides optimal (the best) basic memory performance demands use new memory unit architecture two-layer hopfield network, n-channel time gate, auxiliary reference memory, two nested feedback loops. nnamm explicitly describes dependence time memory trace retrieval, gives possibility metamemory simulation, generalized knowledge representation, distinct description conscious unconscious mental processes. model smallest inseparable part ""atom"" consciousness also defined. nnamm's neurobiological backgrounds applications solving interdisciplinary problems shortly discussed. bsdt could implement ""best neural code"" used nervous tissues animals humans.",4 "integrating topic models latent factors recommendation. 
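The checkerboard problem mentioned above is easy to reproduce: a 1D transposed convolution with stride 2 and kernel size 3 on a constant input already produces a periodic output, because adjacent output pixels receive different numbers of kernel contributions (a generic illustration of the artifact, not pixeldcl itself).

```python
import numpy as np

def conv_transpose_1d(x, k, stride):
    # scatter each input value through the kernel at stride-spaced offsets
    out = np.zeros(stride * (len(x) - 1) + len(k))
    for i, v in enumerate(x):
        out[i * stride : i * stride + len(k)] += v * k
    return out

x = np.ones(8)          # perfectly uniform input
k = np.ones(3)          # perfectly uniform kernel
y = conv_transpose_1d(x, k, stride=2)
print(y)                # interior alternates 2, 1, 2, 1, ... : the checkerboard
```

Establishing direct dependencies among adjacent output pixels, as the paper proposes, is one way to suppress this alternation.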
research personalized recommendation techniques today mostly parted two mainstream directions, i.e., factorization-based approaches topic models. practically, aim benefit numerical ratings textual reviews, correspondingly, compose two major information sources various real-world systems. however, although two approaches supposed correlated goal accurate recommendation, still lacks clear theoretical understanding objective functions mathematically bridged leverage numerical ratings textual reviews collectively, bridge intuitively reasonable match learning procedures rating prediction top-n recommendation tasks, respectively. work, exposit mathematical analysis that, vector-level randomization functions coordinate optimization objectives factorizational topic models unfortunately exist all, although usually pre-assumed intuitively designed literature. fortunately, also point one avoid seeking randomization function optimizing joint factorizational topic (jft) model directly. apply jft model restaurant recommendation, study performance normal cross-city recommendation scenarios, latter extremely difficult task inherent cold-start nature. experimental results real-world datasets verified appealing performance approach previous methods, rating prediction top-n recommendation tasks.",4 "proficiency comparison ladtree reptree classifiers credit risk forecast. predicting credit defaulter perilous task financial industries like banks. ascertaining non-payer giving loan significant conflict-ridden task banker. classification techniques better choice predictive analysis like finding claimant, whether he/she unpretentious customer cheat. defining outstanding classifier risky assignment industrialist like banker. allow computer science researchers drill efficient research works evaluating different classifiers finding best classifier predictive problems. 
research work investigates productivity ladtree classifier reptree classifier credit risk prediction compares fitness various measures. german credit dataset taken used predict credit risk help open source machine learning tool.",4 "quantified multimodal logics simple type theory. present straightforward embedding quantified multimodal logic simple type theory prove soundness completeness. modal operators replaced quantification type possible worlds. present simple experiments, using existing higher-order theorem provers, demonstrate embedding allows automated proofs statements logics, well meta properties them.",4 flip-flop sublinear models graphs: proof theorem 1. prove class-dual almost sublinear models graphs.,4 "stability video detection tracking. paper, study important yet less explored aspect video detection tracking -- stability. surprisingly, prior work tried study it. result, start work proposing novel evaluation metric video detection considers stability accuracy. accuracy, extend existing accuracy metric mean average precision (map). stability, decompose three terms: fragment error, center position error, scale ratio error. error represents one aspect stability. furthermore, demonstrate stability metric low correlation accuracy metric. thus, indeed captures different perspective quality. lastly, based metric, evaluate several existing methods video detection show affect accuracy stability. believe work provide guidance solid baselines future researches related areas.",4 "systematic analysis state-of-the-art 3d lung nodule proposals generation. lung nodule proposals generation primary step lung nodule detection received much attention recent years. paper, first construct model 3-dimensional convolutional neural network (3d cnn) generate lung nodule proposals, achieve state-of-the-art performance. then, analyze series key problems concerning training performance efficiency.
firstly, train 3d cnn model data different resolutions find models trained high resolution input data achieve better lung nodule proposals generation performances especially nodules small sizes, consumes much memory time. then, analyze memory consumptions different platforms experimental results indicate cpu architecture provide us larger memory enables us explore possibilities 3d applications. implement 3d cnn model cpu platform propose intel extended-caffe framework supports many highly-efficient 3d computations, opened source https://github.com/extendedcaffe/extended-caffe.",4 "improving statistical machine translation resource-poor language using related resource-rich languages. propose novel language-independent approach improving machine translation resource-poor languages exploiting similarity resource-rich ones. precisely, improve translation resource-poor source language x_1 resource-rich language given bi-text containing limited number parallel sentences x_1-y larger bi-text x_2-y resource-rich language x_2 closely related x_1. achieved taking advantage opportunities vocabulary overlap similarities languages x_1 x_2 spelling, word order, syntax offer: (1) improve word alignments resource-poor language, (2) augment additional translation options, (3) take care potential spelling differences appropriate transliteration. evaluation indonesian- >english using malay spanish -> english using portuguese pretending spanish resource-poor shows absolute gain 1.35 3.37 bleu points, respectively, improvement best rivaling approaches, using much less additional data. overall, method cuts amount necessary ""real training data factor 2--5.",4 "thoracic disease identification localization limited supervision. accurate identification localization abnormalities radiology images play integral part clinical diagnosis treatment planning. building highly accurate prediction model tasks usually requires large number images manually annotated labels finding sites abnormalities. 
reality, however, annotated data expensive acquire, especially ones location annotations. need methods work well small amount location annotations. address challenge, present unified approach simultaneously performs disease identification localization underlying model images. demonstrate approach effectively leverage class information well limited location annotation, significantly outperforms comparative reference baseline classification localization tasks.",4 "union intersections (uoi) interpretable data driven discovery prediction. increasing size complexity scientific data could dramatically enhance discovery prediction basic scientific applications. realizing potential, however, requires novel statistical analysis methods interpretable predictive. introduce union intersections (uoi), flexible, modular, scalable framework enhanced model selection estimation. methods based uoi perform model selection model estimation intersection union operations, respectively. show uoi-based methods achieve low-variance nearly unbiased estimation small number interpretable features, maintaining high-quality prediction accuracy. perform extensive numerical investigation evaluate uoi algorithm ($uoi_{lasso}$) synthetic real data. so, demonstrate extraction interpretable functional networks human electrophysiology recordings well accurate prediction phenotypes genotype-phenotype data reduced features. also show (with $uoi_{l1logistic}$ $uoi_{cur}$ variants basic framework) improved prediction parsimony classification matrix factorization several benchmark biomedical data sets. results suggest methods based uoi framework could improve interpretation prediction data-driven discovery across scientific fields.",19 "learning attend deep architectures image tracking. discuss attentional model simultaneous object tracking recognition driven gaze data. motivated theories perception, model consists two interacting pathways: identity control, intended mirror pathways neuroscience models. 
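A loose sketch of the intersection/union idea behind uoi: intersect crude feature supports across bootstrap resamples (selection), then fit each candidate support and keep the best by held-out error (estimation). The correlation-threshold selector and every constant below are simplifications assumed for illustration, not the $uoi_{lasso}$ algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 20
beta = np.zeros(d); beta[[0, 3, 7]] = [2.0, -1.5, 1.0]
X = rng.normal(size=(n, d))
y = X @ beta + 0.1 * rng.normal(size=n)

def crude_support(Xb, yb, thresh):
    # crude selector: features whose absolute correlation with y passes thresh
    c = np.abs(Xb.T @ yb) / len(yb)
    return set(np.flatnonzero(c > thresh))

# Selection: for each regularization level (threshold), INTERSECT the
# supports found across bootstrap resamples.
supports = []
for thresh in [0.2, 0.5, 0.8]:
    s = None
    for _ in range(20):
        idx = rng.integers(n, size=n)
        sb = crude_support(X[idx], y[idx], thresh)
        s = sb if s is None else (s & sb)
    supports.append(sorted(s))

# Estimation: fit OLS on each candidate support; keep the best by held-out error.
def fit_eval(sup):
    tr, te = slice(0, 150), slice(150, None)
    coef = np.zeros(d)
    if sup:
        coef[sup] = np.linalg.lstsq(X[tr][:, sup], y[tr], rcond=None)[0]
    err = np.mean((X[te] @ coef - y[te]) ** 2)
    return err, coef

best_err, best_coef = min((fit_eval(s) for s in supports), key=lambda t: t[0])
print(sorted(np.flatnonzero(best_coef)))
```

The intersection step suppresses false positives that appear in only some resamples, while the final model-selection step restores estimation quality on the surviving features.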
identity pathway models object appearance performs classification using deep (factored)-restricted boltzmann machines. point time observations consist foveated images, decaying resolution toward periphery gaze. control pathway models location, orientation, scale speed attended object. posterior distribution states estimated particle filtering. deeper control pathway, encounter attentional mechanism learns select gazes minimize tracking uncertainty. unlike previous work, introduce gaze selection strategies operate presence partial information continuous action space. show straightforward extension existing approach partial information setting results poor performance, propose alternative method based modeling reward surface gaussian process. approach gives good performance presence partial information allows us expand action space small, discrete set fixation points continuous domain.",4 "learning explain non-standard english words phrases. describe data-driven approach automatically explaining new, non-standard english expressions given sentence, building large dataset includes 15 years crowdsourced examples urbandictionary.com. unlike prior studies focus matching keywords slang dictionary, investigate possibility learning neural sequence-to-sequence model generates explanations unseen non-standard english expressions given context. propose dual encoder approach---a word-level encoder learns representation context, second character-level encoder learn hidden representation target non-standard expression. model produce reasonable definitions new non-standard english expressions given context certain confidence.",4 "interactive visual data exploration subjective feedback: information-theoretic approach. visual exploration high-dimensional real-valued datasets fundamental task exploratory data analysis (eda). existing methods use predefined criteria choose representation data. lack methods (i) elicit user learned data (ii) show patterns know yet. 
construct theoretical model identified patterns input knowledge system. knowledge syntax intuitive, ""this set points forms cluster"", requires knowledge maths. background knowledge used find maximum entropy distribution data, system provides user data projections data maximum entropy distribution differ most, hence showing user aspects data maximally informative given user's current knowledge. provide open source eda system tailored interactive visualizations demonstrate concepts. study performance system present use cases synthetic real data. find model prototype system allow user learn information efficiently various data sources system works sufficiently fast practice. conclude information theoretic approach exploratory data analysis patterns observed user formalized constraints provides principled, intuitive, efficient basis constructing eda system.",19 "universal consistency minimax rates online mondrian forests. establish consistency algorithm mondrian forests, randomized classification algorithm implemented online. first, amend original mondrian forest algorithm, considers fixed lifetime parameter. indeed, fact parameter fixed hinders statistical consistency original procedure. modified mondrian forest algorithm grows trees increasing lifetime parameters $\lambda_n$, uses alternative updating rule, allowing work also online fashion. second, provide theoretical analysis establishing simple conditions consistency. theoretical analysis also exhibits surprising fact: algorithm achieves minimax rate (optimal rate) estimation lipschitz regression function, strong extension previous results arbitrary dimension.",19 "iterative school decomposition algorithm solving multi-school bus routing scheduling problem. servicing school transportation demand safely minimum number buses one highest financial goals school transportation directors. achieve objective, good efficient way solve routing scheduling problem required. 
due growth computing power, spotlight shed solving combined problem school bus routing scheduling. recent attempt tried model routing problem maximizing trip compatibilities hope requiring fewer buses scheduling problem. however, over-counting problem associated trip compatibility could diminish performance approach. extended model proposed paper resolve issue along iterative solution algorithm. extended model integrated model multi-school bus routing scheduling problem. result shows better solutions 8 test problems found fewer number buses (up 25%) shorter travel time (up 7% per trip).",12 "nonparametric metadata dependent relational model. introduce nonparametric metadata dependent relational (nmdr) model, bayesian nonparametric stochastic block model network data. nmdr allows entities associated node mixed membership unbounded collection latent communities. learned regression models allow memberships depend on, predicted from, arbitrary node metadata. develop efficient mcmc algorithms learning nmdr models partially observed node relationships. retrospective mcmc methods allow sampler work directly infinite stick-breaking representation nmdr, avoiding need finite truncations. results demonstrate recovery useful latent communities real-world social ecological networks, usefulness metadata link prediction tasks.",4 "unsupervised learning object landmarks factorized spatial embeddings. learning automatically structure object categories remains important open problem computer vision. paper, propose novel unsupervised approach discover learn landmarks object categories, thus characterizing structure. approach based factorizing image deformations, induced viewpoint change object deformation, learning deep neural network detects landmarks consistently visual effects. furthermore, show learned landmarks establish meaningful correspondences different object instances category without impose requirement explicitly. 
assess method qualitatively variety object types, natural man-made. also show unsupervised landmarks highly predictive manually-annotated landmarks face benchmark datasets, used regress high degree accuracy.",4 "randomized nonmonotone block proximal gradient method class structured nonlinear programming. propose randomized nonmonotone block proximal gradient (rnbpg) method minimizing sum smooth (possibly nonconvex) function block-separable (possibly nonconvex nonsmooth) function. iteration, method randomly picks block according prescribed probability distribution solves typically several associated proximal subproblems usually closed-form solution, certain progress objective value achieved. contrast usual randomized block coordinate descent method [23,20], method nonmonotone flavor uses variable stepsizes partially utilize local curvature information smooth component objective function. show accumulation point solution sequence method stationary point problem {\it almost surely} method capable finding approximate stationary point high probability. also establish sublinear rate convergence method terms minimal expected squared norm certain proximal gradients iterations. problem consideration convex, show expected objective values generated rnbpg converge optimal value problem. assumptions, establish sublinear linear rate convergence expected objective values generated monotone version rnbpg. finally, conduct preliminary experiments test performance rnbpg $\ell_1$-regularized least-squares problem dual svm problem machine learning. computational results demonstrate method substantially outperforms randomized block coordinate {\it descent} method fixed variable stepsizes.",12 "applying evolutionary optimisation robot obstacle avoidance. paper presents artificial evolution-based method stereo image analysis application real-time obstacle detection avoidance mobile robot.
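The block-wise proximal step at the heart of the method above can be sketched for the convex $\ell_1$-regularized least-squares case: pick a random block, take a partial gradient step, and soft-threshold. This sketch uses a fixed 1/L step for simplicity, whereas the paper's contribution is precisely the nonmonotone variable-stepsize scheme.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, lam = 100, 12, 0.1                     # illustrative sizes
A = rng.normal(size=(n, d)) / np.sqrt(n)     # roughly unit-norm columns
b = A @ np.array([3.0, -2.0] + [0.0] * (d - 2)) + 0.01 * rng.normal(size=n)

def soft(v, t):
    # proximal operator of t*||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
L = np.linalg.norm(A, 2) ** 2                # crude Lipschitz bound for all blocks
x = np.zeros(d)
for _ in range(3000):
    Bk = blocks[rng.integers(len(blocks))]   # pick one block at random
    g = A[:, Bk].T @ (A @ x - b)             # partial gradient on that block
    x[Bk] = soft(x[Bk] - g / L, lam / L)     # proximal (soft-threshold) step

obj = 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
print(np.round(x, 2))
```

Each iteration touches only one block of coordinates, which is what makes the scheme attractive when a full proximal gradient step is expensive.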
uses parisian approach, consists splitting representation robot's environment large number simple primitives, ""flies"", evolved following biologically inspired scheme give fast, low-cost solution obstacle detection problem mobile robotics.",4 "hebbian/anti-hebbian network derived online non-negative matrix factorization cluster discover sparse features. despite extensive knowledge biophysical properties neurons, commonly accepted algorithmic theory neuronal function. explore hypothesis single-layer neuronal networks perform online symmetric nonnegative matrix factorization (snmf) similarity matrix streamed data. starting snmf cost function derive online algorithm, implemented biologically plausible network local learning rules. demonstrate network performs soft clustering data well sparse feature discovery. derived algorithm replicates many known aspects sensory anatomy biophysical properties neurons including unipolar nature neuronal activity synaptic weights, local synaptic plasticity rules dependence learning rate cumulative neuronal activity. thus, make step towards algorithmic theory neuronal function, facilitate large-scale neural circuit simulations biologically inspired artificial intelligence.",16 "sequential changepoint approach online community detection. present new algorithms detecting emergence community large networks sequential observations. networks modeled using erdos-renyi random graphs edges forming nodes community higher probability. based statistical changepoint detection methodology, develop three algorithms: exhaustive search (es), mixture, hierarchical mixture (h-mix) methods. performance methods evaluated average run length (arl), captures frequency false alarms, detection delay. numerical comparisons show es method performs best; however, exponentially complex. mixture method polynomially complex exploiting fact size community typically small large network. however, may react group active edges form community. 
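The computation the network above is derived from, symmetric NMF of a similarity matrix, can be sketched with batch multiplicative updates (a Ding-style damped rule is assumed here; the paper's point is an online, biologically plausible implementation, which this is not). Similarities are clipped at zero to keep the factorization nonnegative.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two well-separated point clouds; S is their (clipped) inner-product similarity.
pts = np.vstack([rng.normal([5.0, 0.0], 0.3, size=(10, 2)),
                 rng.normal([0.0, 5.0], 0.3, size=(10, 2))])
S = np.maximum(pts @ pts.T, 0.0)

# Symmetric NMF: min_{H >= 0} ||S - H H^T||_F via damped multiplicative updates
H = np.abs(rng.normal(size=(20, 2)))
for _ in range(500):
    H *= 0.5 + 0.5 * (S @ H) / (H @ (H.T @ H) + 1e-9)

labels = H.argmax(axis=1)   # soft clustering read off the factor matrix
print(labels)
```

With block-dominant similarities the two factor columns converge to (scaled) cluster indicators, so the argmax recovers the clustering.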
issue resolved h-mix method, based dendrogram decomposition network. present asymptotic analytical expression arl mixture method threshold large. numerical simulation verifies approximation accurate even non-asymptotic regime. hence, used determine desired threshold efficiently. finally, numerical examples show mixture h-mix methods detect community quickly lower complexity es method.",19 "query-focused opinion summarization user-generated content. present submodular function-based framework query-focused opinion summarization. within framework, relevance ordering produced statistical ranker, information coverage respect topic distribution diverse viewpoints encoded submodular functions. dispersion functions utilized minimize redundancy. first evaluate different metrics text similarity submodularity-based summarization methods. experimenting community qa blog summarization, show system outperforms state-of-the-art approaches automatic evaluation human evaluation. human evaluation task conducted amazon mechanical turk scale, shows systems able generate summaries high overall quality information diversity.",4 "creating capsule wardrobes fashion images. propose automatically create capsule wardrobes. given inventory candidate garments accessories, algorithm must assemble minimal set items provides maximal mix-and-match outfits. pose task subset selection problem. permit efficient subset selection space outfit combinations, develop submodular objective functions capturing key ingredients visual compatibility, versatility, user-specific preference. since adding garments capsule expands possible outfits, devise iterative approach allow near-optimal submodular function maximization. finally, present unsupervised approach learn visual compatibility ""in wild"" full body outfit photos; compatibility metric translates well cleaner catalog photos improves existing methods. 
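A generic one-sided CUSUM recursion conveys the sequential changepoint mechanics used above; a unit mean shift in a scalar stream stands in for an emerging community, whereas the paper's statistics are likelihood ratios over graph observations.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0.0, 1.0, 300),   # pre-change: N(0, 1)
                    rng.normal(1.0, 1.0, 200)])  # post-change: N(1, 1) at t = 300

drift, threshold = 0.5, 12.0    # drift = half the anticipated shift
s, alarm = 0.0, None
for t, v in enumerate(x):
    s = max(0.0, s + v - drift)  # recursive one-sided CUSUM update
    if s > threshold:
        alarm = t                # first threshold crossing
        break

print(alarm)
```

Raising the threshold lengthens the average run length to false alarm at the cost of a larger detection delay, the same trade-off the ARL analysis above quantifies.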
results thousands pieces popular fashion websites show automatic capsule creation potential mimic skilled fashionistas assembling flexible wardrobes, significantly scalable.",4 "learning $\ell^{0}$-graph: $\ell^{0}$-induced sparse subspace clustering. sparse subspace clustering methods, sparse subspace clustering (ssc) \cite{elhamifarv13} $\ell^{1}$-graph \cite{yanw09,chengyyfh10}, effective partitioning data lie union subspaces. methods use $\ell^{1}$-norm $\ell^{2}$-norm thresholding impose sparsity constructed sparse similarity graph, certain assumptions, e.g. independence disjointness, subspaces required obtain subspace-sparse representation, key success. assumptions guaranteed hold practice limit application sparse subspace clustering subspaces general location. paper, propose new sparse subspace clustering method named $\ell^{0}$-graph. contrast required assumptions subspaces existing sparse subspace clustering methods, proved subspace-sparse representation obtained $\ell^{0}$-graph arbitrary distinct underlying subspaces almost surely mild i.i.d. assumption data generation. develop proximal method obtain sub-optimal solution optimization problem $\ell^{0}$-graph proved guarantee convergence. moreover, propose regularized $\ell^{0}$-graph encourages nearby data similar neighbors similarity graph aligned within cluster graph connectivity issue alleviated. extensive experimental results various data sets demonstrate superiority $\ell^{0}$-graph compared competing clustering methods, well effectiveness regularized $\ell^{0}$-graph.",4 "robust video object tracking using particle filter likelihood based feature fusion adaptive template updating. robust algorithm solution proposed tracking object complex video scenes. solution, bootstrap particle filter (pf) initialized object detector, models time-evolving background video signal adaptive gaussian mixture. motion object expressed markov model, defines state transition prior. 
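The submodular-maximization machinery shared by the capsule-wardrobe formulation above (and the query-focused summarization entry before it) reduces, in its simplest form, to the classic (1 - 1/e)-approximate greedy routine for monotone coverage objectives; the items and covered elements below are made up for illustration.

```python
# Hypothetical items, each covering a set of "outfit elements"
items = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}

def greedy_cover(items, k):
    chosen, covered = [], set()
    for _ in range(k):
        # pick the item with the largest marginal coverage gain
        best = max(items, key=lambda i: len(items[i] - covered) if i not in chosen else -1)
        chosen.append(best)
        covered |= items[best]
    return chosen, covered

chosen, covered = greedy_cover(items, 2)
print(chosen, covered)   # "c" first (gain 4), then "a" (gain 3)
```

Diminishing marginal gains (submodularity) are exactly what makes this greedy loop carry its approximation guarantee.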
color texture features used represent object, marginal likelihood based feature fusion approach proposed. corresponding object template model updating procedure developed account possible scale changes object tracking process. experimental results show algorithm beats several existing alternatives tackling challenging scenarios video tracking tasks.",4 "exploring coevolution predator prey morphology behavior. common idiom biology education states, ""eyes front, animal hunts. eyes side, animal hides."" paper, explore one possible explanation predators tend forward-facing, high-acuity visual systems. using agent-based computational model evolution, predators prey interact adapt behavior morphology one another successive generations evolution. model, observe coevolutionary cycle prey swarming behavior predator's visual system, predator prey continually adapt visual system behavior, respectively, evolutionary time reaction one another due well-known ""predator confusion effect."" furthermore, provide evidence predator visual system drives coevolutionary cycle, suggest cycle could closed predator evolves hybrid visual system capable narrow, high-acuity vision tracking prey well broad, coarse vision prey discovery. thus, conflicting demands imposed predator's visual system predator confusion effect could led evolution complex eyes many predators.",4 "novel frank-wolfe algorithm. analysis applications large-scale svm training. recently, renewed interest machine learning community variants sparse greedy approximation procedure concave optimization known the frank-wolfe (fw) method. particular, procedure successfully applied train large-scale instances non-linear support vector machines (svms). specializing fw svm training allowed obtain efficient algorithms also important theoretical results, including convergence analysis training algorithms new characterizations model sparsity.
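The bootstrap particle filter underlying the tracker above can be sketched in 1D with a random-walk state and a Gaussian observation likelihood; the paper's version replaces this likelihood with fused color/texture likelihoods and adds template updating.

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 50, 500
true = np.cumsum(rng.normal(0, 0.5, T))          # hidden trajectory (random walk)
obs = true + rng.normal(0, 1.0, T)               # noisy measurements

particles = rng.normal(0, 1.0, N)
estimates = []
for z in obs:
    particles += rng.normal(0, 0.5, N)           # propagate through motion model
    w = np.exp(-0.5 * (z - particles) ** 2)      # Gaussian observation likelihood
    w /= w.sum()
    estimates.append(np.sum(w * particles))      # posterior-mean state estimate
    idx = rng.choice(N, size=N, p=w)             # multinomial resampling
    particles = particles[idx]

err_pf = np.mean((np.array(estimates) - true) ** 2)
err_obs = np.mean((obs - true) ** 2)
print(err_pf, err_obs)
```

The filtered estimate beats the raw observations because the motion prior suppresses measurement noise.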
paper, present analyze novel variant fw method based new way perform away steps, classic strategy used accelerate convergence basic fw procedure. formulation analysis focused general concave maximization problem simplex. however, specialization algorithm quadratic forms strongly related classic methods computational geometry, namely gilbert mdm algorithms. theoretical side, demonstrate method matches guarantees terms convergence rate number iterations obtained using classic away steps. particular, method enjoys linear rate convergence, result recently proved mdm quadratic forms. practical side, provide experiments several classification datasets, evaluate results using statistical tests. experiments show method faster fw method classic away steps, works well even cases classic away steps slow algorithm. furthermore, improvements obtained without sacrificing predictive accuracy obtained svm model.",4 "discrete geodesic calculus space viscous fluidic objects. based local approximation riemannian distance manifold computationally cheap dissimilarity measure, time discrete geodesic calculus developed, applications shape space explored. dissimilarity measure derived deformation energy whose hessian reproduces underlying riemannian metric, used define length energy discrete paths shape space. notion discrete geodesics defined energy minimizing paths gives rise discrete logarithmic map, variational definition discrete exponential map, time discrete parallel transport. new concept applied shape space shapes considered boundary contours physical objects consisting viscous material. flexibility computational efficiency approach demonstrated topology preserving shape morphing, representation paths shape space via local shape variations path generators, shape extrapolation via discrete geodesic flow, transfer geometric features.",12 linear learning sparse data. linear predictors especially useful data high-dimensional sparse. 
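The basic FW iteration that the variant above builds on: a linear oracle over the probability simplex plus a convex-combination step with the classic 2/(t+2) schedule. This is plain FW with no away steps, stated as the equivalent convex minimization, and the problem data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(20, 5)) / 4.0
b = A @ np.array([0.6, 0.4, 0.0, 0.0, 0.0])   # optimum lies on an edge of the simplex

x = np.ones(5) / 5.0
obj0 = 0.5 * np.sum((A @ x - b) ** 2)
for t in range(500):
    g = A.T @ (A @ x - b)          # gradient of 0.5*||Ax - b||^2
    i = np.argmin(g)               # linear oracle: best vertex of the simplex
    gamma = 2.0 / (t + 2.0)        # classic step-size schedule
    x = (1.0 - gamma) * x          # convex combination with vertex e_i
    x[i] += gamma

obj = 0.5 * np.sum((A @ x - b) ** 2)
print(np.round(x, 3))
```

Every iterate stays exactly feasible because each update is a convex combination of the current point and a simplex vertex; the away steps discussed above exist to escape the slow zig-zagging this plain scheme exhibits near faces.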
one standard techniques used train linear predictor averaged stochastic gradient descent (asgd) algorithm. present efficient implementation asgd avoids dense vector operations. also describe translation invariant extension called centered averaged stochastic gradient descent (casgd).,4 "generalized additive model selection. introduce gamsel (generalized additive model selection), penalized likelihood approach fitting sparse generalized additive models high dimension. method interpolates null, linear additive models allowing effect variable estimated either zero, linear, low-complexity curve, determined data. present blockwise coordinate descent procedure efficiently optimizing penalized likelihood objective dense grid tuning parameter, producing regularization path additive models. demonstrate performance method real simulated data examples, compare existing techniques additive model selection.",19 "reverse hex solver. present solrex, an automated solver game reverse hex. reverse hex, also known rex, misere hex, variant game hex player joins two sides loses game. solrex performs mini-max search state space using scalable parallel depth first proof number search, enhanced pruning inferior moves early detection certain winning strategies. solrex implemented code base hex program solver, solve arbitrary positions board sizes 6x6, hardest position taking less four hours four threads.",4 memory capacity random neural network. paper considers problem information capacity random neural network. network represented matrices square symmetrical. matrices weight determines highest lowest possible value found matrix. examined matrices randomly generated analyzed computer program. find surprising result capacity network maximum binary random neural network change number quantization levels associated weights increases.,4 "support vector machine classification indefinite kernels. propose method support vector machine classification using indefinite kernels.
instead directly minimizing stabilizing nonconvex loss function, algorithm simultaneously computes support vectors proxy kernel matrix used forming loss. interpreted penalized kernel learning problem indefinite kernel matrices treated noisy observations true mercer kernel. formulation keeps problem convex relatively large problems solved efficiently using projected gradient analytic center cutting plane methods. compare performance technique methods several classic data sets.",4 "reinforcement imitation learning via interactive no-regret learning. recent work demonstrated problems-- particularly imitation learning structured prediction-- learner's predictions influence input-distribution tested naturally addressed interactive approach analyzed using no-regret online learning. approaches imitation learning, however, neither require benefit information cost actions. extend existing results two directions: first, develop interactive imitation learning approach leverages cost information; second, extend technique address reinforcement learning. results provide theoretical support commonly observed successes online approximate policy iteration. approach suggests broad new family algorithms provides unifying view existing techniques imitation reinforcement learning.",4 "hybrid genetic algorithm cloud computing applications. paper aid genetic algorithm fuzzy theory, present hybrid job scheduling approach, considers load balancing system reduces total execution time execution cost. try modify standard genetic algorithm reduce iteration creating population aid fuzzy theory. main goal research assign jobs resources considering vm mips length jobs. new algorithm assigns jobs resources considering job length resources capacities. evaluate performance approach famous cloud scheduling models. results experiments show efficiency proposed approach term execution time, execution cost average degree imbalance (di).",4 "symbolic approach reasoning linguistic quantifiers. 
paper investigates possibility performing automated reasoning probabilistic logic probabilities expressed means linguistic quantifiers. linguistic term expressed prescribed interval proportions. instead propagating numbers, qualitative terms propagated accordance numerical interpretation terms. quantified syllogism, modelling chaining probabilistic rules, studied context. shown qualitative counterpart syllogism makes sense, relatively independent threshold defining linguistically meaningful intervals, provided threshold values remain accordance intuition. inference power less full-fledged probabilistic constraint propagation device better corresponds could thought commonsense probabilistic reasoning.",4 "bayesian additive adaptive basis tensor product models modeling high dimensional surfaces: application high-throughput toxicity testing. many modern data sets sampled error complex high-dimensional surfaces. methods tensor product splines gaussian processes effective/well suited characterizing surface two three dimensions may suffer difficulties representing higher dimensional surfaces. motivated high throughput toxicity testing observed dose-response curves cross sections surface defined chemical's structural properties, model developed characterize surface predict untested chemicals' dose-responses. manuscript proposes novel approach models multidimensional surface sum learned basis functions formed tensor product lower dimensional functions, representable basis expansion learned data. model described, gibbs sampling algorithm proposed, investigated simulation study well data taken us epa's toxcast high throughput toxicity testing platform.",19 "long-range fractal correlations literary corpora. paper analyse fractal structure long human-language records mapping large samples texts onto time series. particular mapping set work inspired linguistic basis sense retains {\em word} fundamental unit communication.
results confirm beyond short-range correlations resulting syntactic rules acting sentence level, long-range structures emerge large written language samples give rise long-range correlations use words.",3 "structured low-rank matrix factorization: global optimality, algorithms, applications. recently, convex formulations low-rank matrix factorization problems received considerable attention machine learning. however, formulations often require solving matrix size data matrix, making challenging apply large scale datasets. moreover, many applications data display structures beyond simply low-rank, e.g., images videos present complex spatio-temporal structures largely ignored standard low-rank methods. paper study matrix factorization technique suitable large datasets captures additional structure factors using particular form regularization includes well-known regularizers total variation nuclear norm particular cases. although resulting optimization problem non-convex, show size factors large enough, certain conditions, local minimizer factors yields global minimizer. practical algorithms also provided solve matrix factorization problem, bounds distance given approximate solution optimization problem global optimum derived. examples neural calcium imaging video segmentation hyperspectral compressed recovery show advantages approach high-dimensional datasets.",4 "multi-scale multi-band densenets audio source separation. paper deals problem audio source separation. handle complex ill-posed nature problems audio source separation, current state-of-the-art approaches employ deep neural networks obtain instrumental spectra mixture. study, propose novel network architecture extends recently developed densely connected convolutional network (densenet), shown excellent results image classification tasks. deal specific problem audio source separation, up-sampling layer, block skip connection band-dedicated dense blocks incorporated top densenet. 
proposed approach takes advantage long contextual information outperforms state-of-the-art results sisec 2016 competition large margin terms signal-to-distortion ratio. moreover, proposed architecture requires significantly fewer parameters considerably less training time compared methods.",4 "cell assemblies multiple time scales arbitrary lag constellations. hebb's idea cell assembly fundamental unit neural information processing dominated neuroscience like theoretical concept within past 60 years. range different physiological phenomena, precisely synchronized spiking broadly simultaneous rate increases, subsumed term. yet progress area hampered lack statistical tools would enable extract assemblies arbitrary constellations time lags, multiple temporal scales, partly due severe computational burden. present unifying methodological conceptual framework detects assembly structure many different time scales, levels precision, arbitrary internal organization. applying methodology multiple single unit recordings various cortical areas, find universal cortical coding scheme, assembly structure precision significantly depends brain area recorded ongoing task demands.",16 "stochastic weighted function norm regularization. deep neural networks (dnns) become increasingly important due excellent empirical performance wide range problems. however, regularization generally achieved indirect means, largely due complex set functions defined network difficulty measuring function complexity. exists method literature additive regularization based norm function, classically considered statistical learning theory. work, propose sampling-based approximations weighted function norms regularizers deep neural networks. provide, best knowledge, first proof literature np-hardness computing function norms dnns, motivating necessity stochastic optimization strategy. 
based proposed regularization scheme, stability-based bounds yield $\mathcal{o}(n^{-\frac{1}{2}})$ generalization error proposed regularizer applied convex function sets. demonstrate broad conditions convergence stochastic gradient descent objective, including non-convex function sets defined dnns. finally, empirically validate improved performance proposed regularization strategy convex function sets well dnns real-world classification segmentation tasks.",4 "deep structured learning approach towards automating connectome reconstruction 3d electron micrographs. present deep structured learning method neuron segmentation 3d electron microscopy (em) improves significantly upon state art terms accuracy scalability. method consists 3d u-net classifier predicting affinity graphs voxels, followed iterative region agglomeration. train u-net using new structured loss based malis encourages topological correctness. extension consists two parts: first, $o(n\log(n))$ method compute loss gradient, improving originally proposed $o(n^2)$ algorithm. second, compute gradient two separate passes avoid spurious contributions early training stages. affinity predictions accurate enough simple agglomeration outperforms involved methods used earlier inferior predictions. present results three datasets (cremi, fib, segem) different imaging techniques animals achieve improvements previous results 27%, 15%, 250%. findings suggest single 3d segmentation strategy applied isotropic anisotropic em data. runtime method scales $o(n)$ size volume achieves throughput 2.6 seconds per megavoxel, allowing processing large datasets.",4 "application threshold accepting metaheuristic curriculum based course timetabling. article presents local search approach solution timetabling problems general, particular implementation competition track 3 international timetabling competition 2007 (itc 2007). heuristic search procedure based threshold accepting overcome local optima. 
stochastic neighborhood proposed implemented, randomly removing reassigning events current solution. overall concept incrementally obtained series experiments, describe (sub)section paper. result, successfully derived potential candidate solution approach finals track 3 itc 2007.",4 "learning spectral-spatial-temporal features via recurrent convolutional neural network change detection multispectral imagery. change detection one central problems earth observation extensively investigated recent decades. paper, propose novel recurrent convolutional neural network (recnn) architecture, trained learn joint spectral-spatial-temporal feature representation unified framework change detection multispectral images. end, bring together convolutional neural network (cnn) recurrent neural network (rnn) one end-to-end network. former able generate rich spectral-spatial feature representations, latter effectively analyzes temporal dependency bi-temporal images. comparison previous approaches change detection, proposed network architecture possesses three distinctive properties: 1) end-to-end trainable, contrast existing methods whose components separately trained computed; 2) naturally harnesses spatial information proven beneficial change detection task; 3) capable adaptively learning temporal dependency multitemporal images, unlike algorithms use fairly simple operation like image differencing stacking. far know, first time recurrent convolutional network architecture proposed multitemporal remote sensing image analysis. proposed network validated real multispectral data sets. visual quantitative analysis experimental results demonstrates competitive performance proposed model.",4 lambek-grishin calculus np-complete. lambek-grishin calculus lg symmetric extension non-associative lambek calculus nl. paper prove derivability problem lg np-complete.,4 "tree-structured boosting: connections gradient boosted stumps full decision trees.
additive models, produced gradient boosting, full interaction models, classification regression trees (cart), widely used algorithms investigated largely isolation. show models exist along spectrum, revealing never-before-known connections two approaches. paper introduces novel technique called tree-structured boosting creating single decision tree, shows method produce models equivalent cart gradient boosted stumps extremes varying single parameter. although tree-structured boosting designed primarily provide model interpretability predictive performance needed high-stakes applications like medicine, also produce decision trees represented hybrid models cart boosted stumps outperform either approaches.",19 "two-step fusion process multi-criteria decision applied natural hazards mountains. mountain river torrents snow avalanches generate human material damages dramatic consequences. knowledge natural phenomena often lacking expertise required decision risk management purposes using multi-disciplinary quantitative qualitative approaches. expertise considered decision process based imperfect information coming less reliable conflicting sources. methodology mixing analytic hierarchy process (ahp), multi-criteria aid-decision method, information fusion using belief function theory described. fuzzy sets possibilities theories allow transform quantitative qualitative criteria common frame discernment decision dempster-shafer theory (dst) dezert-smarandache theory (dsmt) contexts. main issues consist basic belief assignments elicitation, conflict identification management, fusion rule choices, results validation also specific needs make difference importance reliability uncertainty fusion process.",4 "properties n-dimensional convolution image deconvolution. convolution system linear time invariant, describe optical imaging process.
based convolution system, many deconvolution techniques developed optical image analysis, boosting space resolution optical images, image denoising, image enhancement on. here, gave properties n-dimensional convolution. using properties, proposed image deconvolution method. method uses series convolution operations deconvolute image. demonstrated method similar deconvolution results state-of-the-art method. core calculation proposed method image convolution, thus method easily integrated gpu mode large-scale image deconvolution.",6 "graphconnect: regularization framework neural networks. deep neural networks proved successful domains large training sets available, number training samples small, performance suffers overfitting. prior methods reducing overfitting weight decay, dropout dropconnect data-independent. paper proposes new method, graphconnect, data-dependent, motivated observation data interest lie close manifold. new method encourages relationships learned decisions resemble graph representing manifold structure. essentially graphconnect designed learn attributes present data samples contrast weight decay, dropout dropconnect simply designed make difficult fit random error noise. empirical rademacher complexity used connect generalization error neural network spectral properties graph learned input data. framework used show graphconnect superior weight decay. experimental results several benchmark datasets validate theoretical analysis, show number training samples small, graphconnect able significantly improve performance weight decay.",4 "impact cognitive radio future management spectrum. cognitive radio breakthrough technology expected profound impact way radio spectrum accessed, managed shared future. paper examine implications cognitive radio future management spectrum.
near-term view involving opportunistic spectrum access model longer-term view involving self-regulating dynamic spectrum access model within society cognitive radios discussed.",4 "unified convex surrogate schatten-$p$ norm. schatten-$p$ norm ($0<p<1$). $p_1,p_2>0$ satisfying $1/p=1/p_1+1/p_2$, equivalence schatten-$p$ norm one matrix schatten-$p_1$ schatten-$p_2$ norms two factor matrices. extend equivalence multiple factor matrices show factor norms convex smooth $p>0$. contrast, original schatten-$p$ norm $0<p<1$, genetic algorithm operates quasispecies regime: advantageous mutant invades positive fraction population probability larger constant $p^*$ (which depend $m$). estimate next probability occurrence catastrophe (the whole population falls fitness level previously reached positive fraction population). asymptotic results suggest following rules: $\pi=\sigma(1-p_c)(1-p_m)^\ell$ slightly larger $1$; $p_m$ order $1/\ell$; $m$ larger $\ell\ln\ell$; running time exponential order $m$. first condition requires $\ell p_m + p_c < \ln\sigma$. conclusions must taken great care: come asymptotic regime, formidable task understand relevance regime real-world problem. least, hope conclusions provide interesting guidelines practical implementation simple genetic algorithm.",12 "mean deviation similarity index: efficient reliable full-reference image quality evaluator. applications perceptual image quality assessment (iqa) image video processing, image acquisition, image compression, image restoration multimedia communication, led development many iqa metrics. paper, reliable full reference iqa model proposed utilize gradient similarity (gs), chromaticity similarity (cs), deviation pooling (dp). considering shortcomings commonly used gs model human visual system (hvs), new gs proposed fusion technique likely follow hvs. propose efficient effective formulation calculate joint similarity map two chromatic channels purpose measuring color changes.
comparison commonly used formulation literature, proposed cs map shown efficient provide comparable better quality predictions. motivated recent work utilizes standard deviation pooling, general formulation dp presented paper used compute final score proposed gs cs maps. proposed formulation dp benefits minkowski pooling proposed power pooling well. experimental results six datasets natural images, synthetic dataset, digitally retouched dataset show proposed index provides comparable better quality predictions recent competing state-of-the-art iqa metrics literature, reliable low complexity. matlab source code proposed metric available https://www.mathworks.com/matlabcentral/fileexchange/59809.",4 "dense rgb-d semantic mapping pixel-voxel neural network. intelligent robotics applications, extending 3d mapping 3d semantic mapping enables robots to, localize respect scene's geometrical features also simultaneously understand higher level meaning scene contexts. previous methods focus geometric 3d reconstruction scene understanding independently notwithstanding fact joint estimation boost accuracy semantic mapping. paper, dense rgb-d semantic mapping system pixel-voxel network proposed, perform dense 3d mapping simultaneously recognizing semantically labelling point 3d map. proposed pixel-voxel network obtains global context information using pixelnet exploit rgb image meanwhile, preserves accurate local shape information using voxelnet exploit corresponding 3d point cloud. unlike existing architecture fuses score maps different models equal weights, proposed softmax weighted fusion stack adaptively learns varying contributions pixelnet voxelnet, fuses score maps two models according respective confidence levels. proposed pixel-voxel network achieves state-of-the-art semantic segmentation performance sun rgb-d benchmark dataset. 
runtime proposed system boosted 11-12hz, enabling near real-time performance using i7 8-cores pc titan x gpu.",4 "seamless integration coordination cognitive skills humanoid robots: deep learning approach. study investigates adequate coordination among different cognitive processes humanoid robot developed end-to-end learning direct perception visuomotor stream. propose deep dynamic neural network model built dynamic vision network, motor generation network, higher-level network. proposed model designed process integrate direct perception dynamic visuomotor patterns hierarchical model characterized different spatial temporal constraints imposed level. conducted synthetic robotic experiments robot learned read human's intention observing gestures generate corresponding goal-directed actions. results verify proposed model able learn tutored skills generalize novel situations. model showed synergic coordination perception, action decision making, integrated coordinated set cognitive skills including visual perception, intention reading, attention switching, working memory, action preparation execution seamless manner. analysis reveals coherent internal representations emerged level hierarchy. higher-level representation reflecting actional intention developed means continuous integration lower-level visuo-proprioceptive stream.",4 "genetic algorithm (ga) feature selection crf based manipuri multiword expression (mwe) identification. paper deals identification multiword expressions (mwes) manipuri, highly agglutinative indian language. manipuri listed eight schedule indian constitution. mwe plays important role applications natural language processing(nlp) like machine translation, part speech tagging, information retrieval, question answering etc. feature selection important factor recognition manipuri mwes using conditional random field (crf). disadvantage manual selection choosing appropriate features running crf motivates us think genetic algorithm (ga). 
using ga able find optimal features run crf. tried fifty generations feature selection along three fold cross validation fitness function. model demonstrated recall (r) 64.08%, precision (p) 86.84% f-measure (f) 73.74%, showing improvement crf based manipuri mwe identification without ga application.",4 "towards reverse-engineering black-box neural networks. many deployed learned models black boxes: given input, returns output. internal information model, architecture, optimisation procedure, training data, disclosed explicitly might contain proprietary information make system vulnerable. work shows attributes neural networks exposed sequence queries. multiple implications. one hand, work exposes vulnerability black-box neural networks different types attacks -- show revealed internal information helps generate effective adversarial examples black box model. hand, technique used better protection private content automatic recognition models using adversarial examples. paper suggests actually hard draw line white box black box models.",19 "disguised face identification (dfi) facial keypoints using spatial fusion convolutional network. disguised face identification (dfi) extremely challenging problem due numerous variations introduced using different disguises. paper introduces deep learning framework first detect 14 facial key-points utilized perform disguised face identification. since training deep learning architectures relies large annotated datasets, two annotated facial key-points datasets introduced. effectiveness facial keypoint detection framework presented keypoint. superiority key-point detection framework also demonstrated comparison deep networks. effectiveness classification performance also demonstrated comparison state-of-the-art face disguise classification methods.",4 "saberlda: sparsity-aware learning topic models gpus. latent dirichlet allocation (lda) popular tool analyzing discrete count data text images. 
applications require lda handle large datasets large number topics. though distributed cpu systems used, gpu-based systems emerged promising alternative high computational power memory bandwidth gpus. however, existing gpu-based lda systems cannot support large number topics use algorithms dense data structures whose time space complexity linear number topics. paper, propose saberlda, gpu-based lda system implements sparsity-aware algorithm achieve sublinear time complexity scales well learn large number topics. address challenges introduced sparsity, propose novel data layout, new warp-based sampling kernel, efficient sparse count matrix updating algorithm improves locality, makes efficient utilization gpu warps, reduces memory consumption. experiments show saberlda learn billions-token-scale data 10,000 topics, almost two orders magnitude larger previous gpu-based systems. single gpu card, saberlda able learn 10,000 topics dataset billions tokens hours, achievable clusters tens machines before.",4 "nested hierarchical dirichlet processes. develop nested hierarchical dirichlet process (nhdp) hierarchical topic modeling. nhdp generalization nested chinese restaurant process (ncrp) allows word follow path topic node according document-specific distribution shared tree. alleviates rigid, single-path formulation ncrp, allowing document easily express thematic borrowings random effect. derive stochastic variational inference algorithm model, addition greedy subtree selection method document, allows efficient inference using massive collections text documents. demonstrate algorithm 1.8 million documents new york times 3.3 million documents wikipedia.",19 "investigating effects diversity mechanisms evolutionary algorithms dynamic environments. evolutionary algorithms successfully applied variety optimisation problems stationary environments. however, many real world optimisation problems set dynamic environments success criteria shifts regularly. 
population diversity affects algorithmic performance, particularly multiobjective dynamic problems. diversity mechanisms methods altering evolutionary algorithms way promotes maintenance population diversity. project intends measure compare performance effect variety diversity mechanisms evolutionary algorithm facing assortment dynamic problems.",4 "aspect-based opinion summarization convolutional neural networks. paper considers aspect-based opinion summarization (aos) reviews particular products. enable real applications, aos system needs address two core subtasks, aspect extraction sentiment classification. existing approaches aspect extraction, use linguistic analysis topic modeling, general across different products precise enough suitable particular products. instead take less general precise scheme, directly mapping review sentence pre-defined aspects. tackle aspect mapping sentiment classification, propose two convolutional neural network (cnn) based methods, cascaded cnn multitask cnn. cascaded cnn contains two levels convolutional networks. multiple cnns level 1 deal aspect mapping task, single cnn level 2 deals sentiment classification. multitask cnn also contains multiple aspect cnns sentiment cnn, different networks share word embeddings. experimental results indicate cascaded multitask cnns outperform svm-based methods large margins. multitask cnn generally performs better cascaded cnn.",4 "graphs machine learning: introduction. graphs commonly used characterise interactions objects interest. based straightforward formalism, used many scientific fields computer science historical sciences. paper, give introduction methods relying graphs learning. includes unsupervised supervised methods. unsupervised learning algorithms usually aim visualising graphs latent spaces and/or clustering nodes. focus extracting knowledge graph topologies. 
existing techniques applicable static graphs, edges evolve time, recent developments shown could extended deal evolving networks. supervised context, one generally aims inferring labels numerical values attached nodes using graph and, available, node characteristics. balancing two sources information challenging, especially disagree locally globally. contexts, supervised un-supervised, data relational (augmented one several global graphs) described above, graph valued. latter case, object interest given full graph (possibly completed characteristics). context, natural tasks include graph clustering (as producing clusters graphs rather clusters nodes single graph), graph classification, etc. 1 real networks one first practical studies graphs dated back original work moreno [51] 30s. since then, growing interest graph analysis associated strong developments modelling processing data. graphs used many scientific fields. biology [54, 2, 7], instance, metabolic networks describe pathways biochemical reactions [41], social sciences networks used represent relation ties actors [66, 56, 36, 34]. examples include powergrids [71] web [75]. recently, networks also considered areas geography [22] history [59, 39]. machine learning, networks seen powerful tools model problems order extract information data prediction purposes. object paper. complete surveys, refer [28, 62, 49, 45]. section, introduce notations highlight properties shared real networks. section 2, consider methods aiming extracting information unique network. particularly focus clustering methods goal find clusters vertices. finally, section 3, techniques take series networks account, network",19 "discriminative density-ratio estimation. covariate shift challenging problem supervised learning results discrepancy training test distributions. effective approach recently drew considerable attention research community reweight training samples minimize discrepancy. 
specific, many methods based developing density-ratio (dr) estimation techniques apply regression classification problems. although methods work well regression problems, performance classification problems satisfactory. due key observation methods focus matching sample marginal distributions without paying attention preserving separation classes reweighted space. paper, propose novel method discriminative density-ratio (ddr) estimation addresses aforementioned problem aims estimating density-ratio joint distributions class-wise manner. proposed algorithm iterative procedure alternates estimating class information test data estimating new density ratio class. incorporate estimated class information test data, soft matching technique proposed. addition, employ effective criterion adopts mutual information indicator stop iterative procedure resulting decision boundary lies sparse region. experiments synthetic benchmark datasets demonstrate superiority proposed method terms accuracy robustness.",4 "recognizing combinations facial action units different intensity using mixture hidden markov models neural network. facial action coding system consists 44 action units (aus) 7000 combinations. hidden markov models (hmms) classifier used successfully recognize facial action units (aus) expressions due ability deal au dynamics. however, separate hmm necessary single au au combination. since combinations au numbering thousands, efficient method needed. paper accurate real-time sequence-based system representation recognition facial aus presented. 
system following characteristics: 1) employing mixture hmms neural network, develop novel accurate classifier, deal au dynamics, recognize subtle changes, also robust intensity variations, 2) although use hmm single au only, employing neural network recognize single combination au, 3) using geometric appearance-based features, applying efficient dimension reduction techniques, system robust illumination changes represent temporal information involved formation facial expressions. extensive experiments cohn-kanade database show superiority proposed method, comparison classifiers. keywords: classifier design evaluation, data fusion, facial action units (aus), hidden markov models (hmms), neural network (nn).",4 "orthographic syllable basic unit smt related languages. explore use orthographic syllable, variable-length consonant-vowel sequence, basic unit translation related languages use abugida alphabetic scripts. show orthographic syllable level translation significantly outperforms models trained basic units (word, morpheme character) training small parallel corpora.",4 "geometric blind source separation method based facet component analysis. given set mixtures, blind source separation attempts retrieve source signals without little information mixing process. present geometric approach blind separation nonnegative linear mixtures termed {\em facet component analysis} (fca). approach based facet identification underlying cone structure data. earlier works focus recovering cone locating vertices (vertex component analysis vca) based mutual sparsity condition requires source signal possess stand-alone peak spectrum. formulate alternative conditions enough data points fall facets cone instead accumulating around vertices. find regime unique solvability, make use geometric density properties data points, develop efficient facet identification method combining data classification linear regression. 
noisy data, show denoising methods may employed, total variation technique image processing, principal component analysis. show computational results nuclear magnetic resonance spectroscopic data substantiate method.",12 "use non-stationary policies infinite-horizon discounted markov decision processes. consider infinite-horizon $\gamma$-discounted markov decision processes, known exists stationary optimal policy. consider algorithm value iteration sequence policies $\pi_1,...,\pi_k$ implicitly generates iteration $k$. provide performance bounds non-stationary policies involving last $m$ generated policies reduce state-of-the-art bound last stationary policy $\pi_k$ factor $\frac{1-\gamma}{1-\gamma^m}$. particular, use non-stationary policies allows reduce usual asymptotic performance bounds value iteration errors bounded $\epsilon$ iteration $\frac{\gamma}{(1-\gamma)^2}\epsilon$ $\frac{\gamma}{1-\gamma}\epsilon$, significant usual situation $\gamma$ close 1. given bellman operators computed error $\epsilon$, surprising consequence result problem ""computing approximately optimal non-stationary policy"" much simpler ""computing approximately optimal stationary policy"", even slightly simpler ""approximately computing value fixed policy"", since last problem guarantee $\frac{1}{1-\gamma}\epsilon$.",4 serious flaws korf et al.'s analysis time complexity a*. paper withdrawn.,4 "adaptive framework tune coordinate systems evolutionary algorithms. evolutionary computation research community, performance evolutionary algorithms (eas) depends strongly implemented coordinate system. however, commonly used coordinate system fixed well suited different function landscapes, eas thus might search efficiently. overcome shortcoming, paper propose framework, named acos, adaptively tune coordinate systems eas.
acos, eigen coordinate system established making use cumulative population distribution information, obtained based covariance matrix adaptation strategy additional archiving mechanism. since population distribution information reflect features function landscape extent, eas eigen coordinate system capability identify modality function landscape. addition, eigen coordinate system coupled original coordinate system, selected according probability vector. probability vector aims determine selection ratio coordinate system individual, adaptively updated based collected information offspring. acos applied two popular ea paradigms, i.e., particle swarm optimization (pso) differential evolution (de), solving 30 test functions 30 50 dimensions 2014 ieee congress evolutionary computation. experimental studies demonstrate effectiveness.",4 "learning compose skills. present differentiable framework capable learning wide variety compositions simple policies call skills. recursively composing skills themselves, create hierarchies display complex behavior. skill networks trained generate skill-state embeddings provided inputs trainable composition function, turn outputs policy overall task. experiments environment consisting multiple collect evade tasks show architecture able quickly build complex skills simpler ones. furthermore, learned composition function displays transfer unseen combinations skills, allowing zero-shot generalizations.",4 "combinatorial pyramids discrete geometry energy-minimizing segmentation. paper defines basis new hierarchical framework segmentation algorithms based energy minimization schemes. new framework based two formal tools. first, combinatorial pyramid encode efficiently hierarchy partitions. secondly, discrete geometric estimators measure precisely important geometric parameters regions. measures combined photometrical topological features partition allows design energy terms based discrete measures. 
segmentation framework exploits energies build pyramid image partitions minimization scheme. experiments illustrating framework shown discussed.",4 "classification sparse overlapping groups. classification sparsity constraint solution plays central role many high dimensional machine learning applications. cases, features grouped together entire subsets features selected selected. many applications, however, restrictive. paper, interested less restrictive form structured sparse feature selection: assume features grouped according notion similarity, features group need selected task hand. groups comprised disjoint sets features, sometimes referred ""sparse group"" lasso, allows working richer class models traditional group lasso methods. framework generalizes conventional sparse group lasso allowing overlapping groups, additional flexibility needed many applications one presents challenges. main contribution paper new procedure called sparse overlapping group (sog) lasso, convex optimization program automatically selects similar features classification high dimensions. establish model selection error bounds soglasso classification problems fairly general setting. particular, error bounds first results classification using sparse group lasso. furthermore, general soglasso bound specializes results lasso group lasso, known new. soglasso motivated multi-subject fmri studies functional activity classified using brain voxels features, source localization problems magnetoencephalography (meg), analyzing gene activation patterns microarray data analysis. experiments real synthetic data demonstrate advantages soglasso compared lasso group lasso.",4 "landcover fuzzy logic classification maximum likelihood. present days remote sensing used application many sectors. remote sensing uses different images like multispectral, hyper spectral ultra spectral. remote sensing image classification one significant method classify image.
state classify maximum likelihood classification fuzzy logic. experimenting fuzzy logic like spatial, spectral texture methods different sub methods used image classification.",4 "languages actions, formal grammars qualitative modeling companies. paper discuss methods using language actions, formal languages, grammars qualitative conceptual linguistic modeling companies technological human institutions. main problem following discussion problem find describe language structure external internal flow information companies. anticipate language structure external internal base flows determine structure companies. structure modeling abstract industrial company internal base flow information constructed certain flow words composed theoretical parts-processes-actions language. language procedures found external base flow information insurance company. formal stochastic grammar language procedures found statistical methods used understanding tendencies health care industry. present model human communications random walk semantic tree",4 "data granulation principles uncertainty. researches granular modeling produced variety mathematical models, intervals, (higher-order) fuzzy sets, rough sets, shadowed sets, suitable characterize so-called information granules. modeling input data uncertainty recognized crucial aspect information granulation. moreover, uncertainty well-studied concept many mathematical settings, probability theory, fuzzy set theory, possibility theory. fact suggests appropriate quantification uncertainty expressed information granule model could used define invariant property, exploited practical situations information granulation. perspective, procedure information granulation effective uncertainty conveyed synthesized information granule monotonically increasing relation uncertainty input data. paper, present data granulation framework elaborates principles uncertainty introduced klir.
uncertainty mesoscopic descriptor systems data, possible apply principles regardless input data type specific mathematical setting adopted information granules. proposed framework conceived (i) offer guideline synthesis information granules (ii) build groundwork compare quantitatively judge different data granulation procedures. provide suitable case study, introduce new data granulation technique based minimum sum distances, designed generate type-2 fuzzy sets. analyze procedure performing different experiments two distinct data types: feature vectors labeled graphs. results show uncertainty input data suitably conveyed generated type-2 fuzzy set models.",4 "read-bad: new dataset evaluation scheme baseline detection archival documents. text line detection crucial application associated automatic text recognition keyword spotting. modern algorithms perform well well-established datasets since either comprise clean data simple/homogeneous page layouts. collected annotated 2036 archival document images different locations time periods. dataset contains varying page layouts degradations challenge text line segmentation methods. well established text line segmentation evaluation schemes detection rate recognition accuracy demand binarized data annotated pixel level. producing ground truth means laborious needed determine method's quality. paper propose new evaluation scheme based baselines. proposed scheme need binarization handle skewed well rotated text lines. icdar 2017 competition baseline detection icdar 2017 competition layout analysis challenging medieval manuscripts used evaluation scheme. finally, present results achieved recently published text line detection algorithm.",4 "modularity component analysis versus principal component analysis. paper exact linear relation leading eigenvectors modularity matrix singular vectors uncentered data matrix developed. based analysis concept modularity component defined, properties developed.
shown modularity component analysis used cluster data similar traditional principal component analysis used except modularity component analysis require data centering.",19 "mining determinism human strategic behavior. work lies fusion experimental economics data mining. continues author's previous work mining behaviour rules human subjects experimental data, game-theoretic predictions partially fail work. game-theoretic predictions aka equilibria tend success experienced subjects specific games, rarely given. apart game theory, contemporary experimental economics offers number alternative models. relevant literature, models always biased psychological near-psychological theories claimed proven data. work introduces data mining approach problem without using vast psychological background. apart determinism, biases regarded. two datasets different human subject experiments taken evaluation. first one repeated mixed strategy zero sum game second - repeated ultimatum game. result, way mining deterministic regularities human strategic behaviour described evaluated. future work, design new representation formalism discussed.",4 "learning inverse mappings adversarial criterion. propose flipped-adversarial autoencoder (faae) simultaneously trains generative model g maps arbitrary latent code distribution data distribution encoder e embodies ""inverse mapping"" encodes data sample latent code vector. unlike previous hybrid approaches leverage adversarial training criterion constructing autoencoders, faae minimizes re-encoding errors latent space exploits adversarial criterion data space. experimental evaluations demonstrate proposed framework produces sharper reconstructed images time enabling inference captures rich semantic representation data.",4 "encoding monotonic multi-set preferences using ci-nets: preliminary report. cp-nets variants constitute one main ai approaches specifying reasoning preferences. 
ci-nets, particular, cp-inspired formalism representing ordinal preferences sets goods, typically required monotonic. considering also goods often come multi-sets rather sets, natural question whether ci-nets used less directly encode preferences multi-sets. provide initial ideas achieve this, sense least restricted form reasoning framework, call ""confined reasoning"", efficiently reduced reasoning ci-nets. framework nevertheless allows encoding preferences multi-sets unbounded multiplicities. also show extent used represent preferences multiplicities goods stated explicitly (""purely qualitative preferences"") well potential use generalization ci-nets component recent system evidence aggregation.",4 "probabilistic reasoning information compression multiple alignment, unification search: introduction overview. article introduces idea probabilistic reasoning (pr) may understood ""information compression multiple alignment, unification search"" (icmaus). context, multiple alignment meaning similar distinct meaning bio-informatics, unification means simple merging matching patterns, meaning related simpler meaning term logic. software model, sp61, developed discovery formation 'good' multiple alignments, evaluated terms information compression. model described outline. using examples sp61 model, article describes outline icmaus framework model various kinds pr including: pr best-match pattern recognition information retrieval; one-step 'deductive' 'abductive' pr; inheritance attributes class hierarchy; chains reasoning (probabilistic decision networks decision trees, pr 'rules'); geometric analogy problems; nonmonotonic reasoning reasoning default values; modelling function bayesian network.",4 "adaptive gril estimator diverging number parameters. consider problem variables selection estimation linear regression model situations number parameters diverges sample size. propose adaptive generalized ridge-lasso (\mbox{adagril}) extension adaptive elastic net.
adagril incorporates information redundancy among correlated variables model selection estimation. combines strengths quadratic regularization adaptively weighted lasso shrinkage. paper, highlight grouped selection property adacnet method (one type adagril) equal correlation case. weak conditions, establish oracle property adagril ensures optimal large performance dimension high. consequently, achieves goals handling problem collinearity high dimension enjoys oracle property. moreover, show adagril estimator achieves sparsity inequality, i.e., bound terms number non-zero components 'true' regression coefficient. bound obtained similar weak restricted eigenvalue (re) condition used lasso. simulation studies show particular cases adagril outperform competitors.",19 "accelerating neural architecture search using performance prediction. methods neural network hyperparameter optimization meta-modeling computationally expensive due need train large number model configurations. paper, show standard frequentist regression models predict final performance partially trained model configurations using features based network architectures, hyperparameters, time-series validation performance data. empirically show performance prediction models much effective prominent bayesian counterparts, simpler implement, faster train. models predict final performance visual classification language modeling domains, effective predicting performance drastically varying model architectures, even generalize model classes. using prediction models, also propose early stopping method hyperparameter optimization meta-modeling, obtains speedup factor 6x hyperparameter optimization meta-modeling. finally, empirically show early stopping method seamlessly incorporated reinforcement learning-based architecture selection algorithms bandit based search methods.
extensive experimentation, empirically show performance prediction models early stopping algorithm state-of-the-art terms prediction accuracy speedup achieved still identifying optimal model configurations.",4 "9-im topological operators qualitative spatial relations using 3d selective nef complexes logic rules bodies. paper presents method compute automatically topological relations using swrl rules. calculation rules based definition selective nef complexes nef polyhedra structure generated standard polyhedron. selective nef complexes data model providing set binary boolean operators union, difference, intersection symmetric difference, unary operators interior, closure boundary. work, operators used compute topological relations objects defined constraints 9 intersection model (9-im) egenhofer. help constraints, defined procedure compute topological relations nef polyhedra. topological relationships disjoint, meets, contains, inside, covers, coveredby, equals overlaps, defined top-level ontology specific semantic definition relation transitive, symmetric, asymmetric, functional, reflexive, irreflexive. results computation topological relationships stored owl-dl ontology allowing infer new relationships objects. addition, logic rules based semantic web rule language allows definition logic programs define topological relationships computed kind objects specific attributes. instance, ""building"" overlaps ""railway"" ""railstation"".",4 "handwritten digit recognition committee deep neural nets gpus. competitive mnist handwritten digit recognition benchmark long history broken records since 1998. recent substantial improvement others dates back 7 years (error rate 0.4%) . recently able significantly improve result, using graphics cards greatly speed training simple deep mlps, achieved 0.35%, outperforming previous complex methods. 
report another substantial improvement: 0.31% obtained using committee mlps.",4 "advanced mean field theory restricted boltzmann machine. learning restricted boltzmann machine typically hard due computation gradients log-likelihood function. describe network state statistics restricted boltzmann machine, develop advanced mean field theory based bethe approximation. theory provides efficient message passing based method evaluates partition function (free energy) also gradients without requiring statistical sampling. results compared obtained computationally expensive sampling based method.",3 "priors initial hyperparameters affect gaussian process regression models. hyperparameters gaussian process regression (gpr) model specified kernel often estimated data via maximum marginal likelihood. due non-convexity marginal likelihood respect hyperparameters, optimization may converge global maxima. common approach tackle issue use multiple starting points randomly selected specific prior distribution. result choice prior distribution may play vital role predictability approach. however, exists little research literature study impact prior distributions hyperparameter estimation performance gpr. paper, provide first empirical study problem using simulated real data experiments. consider different types priors initial values hyperparameters commonly used kernels investigate influence priors predictability gpr models. results reveal that, kernel chosen, different priors initial hyperparameters significant impact performance gpr prediction, despite estimates hyperparameters different true values cases.",19 "school bus routing maximizing trip compatibility. school bus planning usually divided routing scheduling due complexity solving concurrently. however, separation two steps may lead worse solutions higher overall costs solving together. 
finding minimal number trips routing problem, neglecting importance trip compatibility may increase number buses actually needed scheduling problem. paper proposes new formulation multi-school homogeneous fleet routing problem maximizes trip compatibility minimizing total travel time. incorporates trip compatibility scheduling problem routing problem. since problem inherently routing problem, finding good solution cumbersome. compare performance model traditional routing problems, generate eight mid-size data sets. importing generated trips routing problems bus scheduling (blocking) problem, shown proposed model uses 13% fewer buses common traditional routing models.",12 "unfolding partiality disjunctions stable model semantics. paper studies implementation methodology partial disjunctive stable models partiality disjunctions unfolded logic program implementation stable models normal (disjunction-free) programs used core inference engine. unfolding done two separate steps. firstly, shown partial stable models captured total stable models using simple linear modular program transformation. hence, reasoning tasks concerning partial stable models solved using implementation total stable models. disjunctive partial stable models lacking implementations become available translation handles also disjunctive case. secondly, shown total stable models disjunctive programs determined computing stable models normal programs. hence, implementation stable models normal programs used core engine implementing disjunctive programs. feasibility approach demonstrated constructing system computing stable models disjunctive programs using smodels system core engine. performance resulting system compared dlv state-of-the-art special purpose system disjunctive programs.",4 "learning reporting dynamics breaking news rumour detection social media. 
breaking news leads situations fast-paced reporting social media, producing kinds updates related news stories, albeit caveat early updates tend rumours, i.e., information unverified status time posting. flagging information unverified helpful avoid spread information may turn false. detection rumours also feed rumour tracking system ultimately determines veracity. paper introduce novel approach rumour detection learns sequential dynamics reporting breaking news social media detect rumours new stories. using twitter datasets collected five breaking news stories, experiment conditional random fields sequential classifier leverages context learnt event rumour detection, compare state-of-the-art rumour detection system well baselines. contrast existing work, classifier need observe tweets querying piece information deem rumour, instead detect rumours tweet alone exploiting context learnt event. classifier achieves competitive performance, beating state-of-the-art classifier relies querying tweets improved precision recall, well outperforming best baseline nearly 40% improvement terms f1 score. scale diversity experiments reinforces generalisability classifier.",4 similar elements metric labeling complete graphs. consider problem involves finding similar elements collection sets. problem motivated applications machine learning pattern recognition. formulate similar elements problem optimization give efficient approximation algorithm finds solution within factor 2 optimal. similar elements problem special case metric labeling problem also give efficient 2-approximation algorithm metric labeling problem complete graphs.,4 "early stage influenza detection twitter. influenza acute respiratory illness occurs virtually every year results substantial disease, death expense. detection influenza earliest stage would facilitate timely action could reduce spread illness. 
existing systems cdc eiss try collect diagnosis data, almost entirely manual, resulting two-week delays clinical data acquisition. twitter, popular microblogging service, provides us perfect source early-stage flu detection due real-time nature. example, flu breaks out, people get flu may post related tweets enables detection flu breakout promptly. paper, investigate real-time flu detection problem twitter data proposing flu markov network (flu-mn): spatio-temporal unsupervised bayesian algorithm based 4 phase markov network, trying identify flu breakout earliest stage. test model real twitter datasets united states along baselines multiple applications, real-time flu breakout detection, future epidemic phase prediction, influenza-like illness (ili) physician visits. experimental results show robustness effectiveness approach. build real-time flu reporting system based proposed approach, hopeful would help government health organizations identifying flu outbreaks facilitating timely actions decrease unnecessary mortality.",4 "diffusion component analysis: unraveling functional topology biological networks. complex biological systems successfully modeled biochemical genetic interaction networks, typically gathered high-throughput (htp) data. networks used infer functional relationships genes proteins. using intuition topological role gene network relates biological function, local diffusion based ""guilt-by-association"" graph-theoretic methods success inferring gene functions. seek improve function prediction integrating diffusion-based methods novel dimensionality reduction technique overcome incomplete noisy nature network data. paper, introduce diffusion component analysis (dca), framework plugs diffusion model learns low-dimensional vector representation node encode topological properties network. proof concept, demonstrate dca's substantial improvement state-of-the-art diffusion-based approaches predicting protein function molecular interaction networks.
moreover, dca framework integrate multiple networks heterogeneous sources, consisting genomic information, biochemical experiments resources, even improve function prediction. yet another layer performance gain achieved integrating dca framework support vector machines take node vector representations features. overall, dca framework provides novel representation nodes network used plug-in architecture machine learning algorithms decipher topological properties obtain novel insights interactomes.",16 "surrogate model assisted cooperative coevolution large scale optimization. shown cooperative coevolution (cc) effectively deal large scale optimization problems (lsops) divide-and-conquer strategy. however, performance severely restricted current context-vector-based sub-solution evaluation method since method needs access original high dimensional simulation model evaluating sub-solution thus requires many computation resources. alleviate issue, study proposes novel surrogate model assisted cooperative coevolution (sacc) framework. sacc constructs surrogate model sub-problem obtained via decomposition employs evaluate corresponding sub-solutions. original simulation model adopted reevaluate good sub-solutions selected surrogate models, real evaluated sub-solutions turn employed update surrogate models. means, computation cost could greatly reduced without significantly sacrificing evaluation quality. show efficiency sacc, study uses radial basis function (rbf) success-history based adaptive differential evolution (shade) surrogate model optimizer, respectively. rbf shade proved effective small medium scale problems. study first scales lsops 1000 dimensions sacc framework, tailored certain extent adapting characteristics lsop sacc. 
empirical studies ieee cec 2010 benchmark functions demonstrate sacc significantly enhances evaluation efficiency sub-solutions, even much fewer computation resource, resultant rbf-shade-sacc algorithm able find much better solutions traditional cc algorithms.",4 "mixing energy models genetic algorithms on-lattice protein structure prediction. protein structure prediction (psp) computationally challenging problem. challenge largely comes fact energy function needs minimised order obtain native structure given protein clearly known. high resolution 20x20 energy model could better capture behaviour actual energy function low resolution energy model hydrophobic polar. however, fine grained details high resolution interaction energy matrix often informative guiding search. contrast, low resolution energy model could effectively bias search towards certain promising directions. paper, develop genetic algorithm mainly uses high resolution energy model protein structure evaluation uses low resolution hp energy model focussing search towards exploring structures hydrophobic cores. experimentally show mixing energy models leads significant lower energy structures compared state-of-the-art results.",4 "human-grounded evaluation benchmark local explanations machine learning. order people able trust take advantage results advanced machine learning artificial intelligence solutions real decision making, people need able understand machine rationale given output. research explain artificial intelligence (xai) addresses aim, need evaluation human relevance understandability explanations. work contributes novel methodology evaluating quality human interpretability explanations machine learning models. present evaluation benchmark instance explanations text image classifiers. explanation meta-data benchmark generated user annotations image text samples. describe benchmark demonstrate utility quantitative evaluation explanations generated recent machine learning algorithm. 
research demonstrates human-grounded evaluation could used measure qualify local machine-learning explanations.",4 "anisotropic diffusion-based kernel matrix model face liveness detection. facial recognition verification widely used biometric technology security system. unfortunately, face biometrics vulnerable spoofing attacks using photographs videos. paper, present anisotropic diffusion-based kernel matrix model (adkmm) face liveness detection prevent face spoofing attacks. use anisotropic diffusion enhance edges boundary locations face image, kernel matrix model extract face image features call diffusion-kernel (d-k) features. d-k features reflect inner correlation face image sequence. introduce convolutional neural networks extract deep features, then, employ generalized multiple kernel learning method fuse d-k features deep features achieve better performance. experimental evaluation two publicly available datasets shows proposed method outperforms state-of-the-art face liveness detection methods.",4 "learning hidden unit contributions unsupervised acoustic model adaptation. work presents broad study adaptation neural network acoustic models means learning hidden unit contributions (lhuc) -- method linearly re-combines hidden units speaker- environment-dependent manner using small amounts unsupervised adaptation data. also extend lhuc speaker adaptive training (sat) framework leads adaptable dnn acoustic model, working speaker-dependent speaker-independent manner, without requirements maintain auxiliary speaker-dependent feature extractors introduce significant speaker-dependent changes dnn structure. series experiments four different speech recognition benchmarks (ted talks, switchboard, ami meetings, aurora4) comprising 270 test speakers, show lhuc test-only sat variants results consistent word error rate reductions ranging 5% 23% relative depending task degree mismatch training test data.
addition, investigated effect amount adaptation data per speaker, quality unsupervised adaptation targets, complementarity adaptation techniques, one-shot adaptation, extension adapting dnns trained sequence discriminative manner.",4 "polarimetric hierarchical semantic model scattering mechanism based polsar image classification. polarimetric sar (polsar) image classification, challenge classify aggregated terrain types, urban area, semantic homogenous regions due sharp bright-dark variations intensity. aggregated terrain type formulated similar ground objects aggregated together. paper, polarimetric hierarchical semantic model (phsm) firstly proposed overcome disadvantage based constructions primal-level middle-level semantic. primal-level semantic polarimetric sketch map consists sketch segments sparse representation polsar image. middle-level semantic region map extract semantic homogenous regions sketch map exploiting topological structure sketch segments. mapping region map polsar image, complex polsar scene partitioned aggregated, structural homogenous pixel-level subspaces characteristics relatively coherent terrain types subspace. then, according characteristics three subspaces above, three specific methods adopted, furthermore polarimetric information exploited improve segmentation result. experimental results polsar data sets different bands sensors demonstrate proposed method superior state-of-the-art methods region homogeneity edge preservation terrain classification.",4 "occurrence statistics entities, relations types web. problem collecting reliable estimates occurrence entities open web forms premise report. models learned tagging entities cannot expected perform well deployed web. owing severe mismatch distributions entities web relatively diminutive training data. 
report, build case maximum mean discrepancy estimation occurrence statistics entities web, taking review named entity disambiguation techniques related concepts along way.",4 turkish pos tagging reducing sparsity morpheme tags small datasets. sparsity one major problems natural language processing. problem becomes even severe agglutinating languages highly prone inflected. deal sparsity turkish adopting morphological features part-of-speech tagging. learn inflectional derivational morpheme tags turkish using conditional random fields (crf) employ morpheme tags part-of-speech (pos) tagging using hidden markov models (hmms) mitigate sparsity. results show using morpheme tags pos tagging helps alleviate sparsity emission probabilities. model outperforms hidden markov model based pos tagging models small training datasets turkish. obtain accuracy 94.1% morpheme tagging 89.2% pos tagging 5k training dataset.,4 "online influence maximization independent cascade model semi-bandit feedback. study stochastic online problem learning influence social network semi-bandit feedback, observe users influence other. problem combines challenges limited feedback, learning agent observes influenced portion network, combinatorial number actions, cardinality feasible set exponential maximum number influencers. propose computationally efficient ucb-like algorithm, imlinucb, analyze it. regret bounds polynomial quantities interest; reflect structure network probabilities influence. moreover, depend inherently large quantities, cardinality action set. best knowledge, first results. imlinucb permits linear generalization therefore suitable large-scale problems. experiments show regret imlinucb scales suggested upper bounds several representative graph topologies; based linear generalization, imlinucb significantly reduce regret real-world influence maximization semi-bandits.",4 "spatially transformed adversarial examples. 
recent studies show widely used deep neural networks (dnns) vulnerable carefully crafted adversarial examples. many advanced algorithms proposed generate adversarial examples leveraging $\mathcal{l}_p$ distance penalizing perturbations. researchers explored different defense methods defend adversarial attacks. effectiveness $\mathcal{l}_p$ distance metric perceptual quality remains active research area, paper instead focus different type perturbation, namely spatial transformation, opposed manipulating pixel values directly prior works. perturbations generated spatial transformation could result large $\mathcal{l}_p$ distance measures, extensive experiments show spatially transformed adversarial examples perceptually realistic difficult defend existing defense systems. potentially provides new direction adversarial example generation design corresponding defenses. visualize spatial transformation based perturbation different examples show technique produce realistic adversarial examples smooth image deformation. finally, visualize attention deep networks different types adversarial examples better understand examples interpreted.",4 "cooperative multi-agent planning: survey. cooperative multi-agent planning (map) relatively recent research field combines technologies, algorithms techniques developed artificial intelligence planning multi-agent systems communities. planning generally treated single-agent task, map generalizes concept considering multiple intelligent agents work cooperatively develop course action satisfies goals group. paper reviews relevant approaches map, putting focus solvers took part 2015 competition distributed multi-agent planning, classifies according key features relative performance.",4 "evaluation explore-exploit policies multi-result ranking systems. analyze problem using explore-exploit techniques improve precision multi-result ranking systems web search, query autocompletion news recommendation. 
adopting exploration policy directly online, without understanding impact production system, may unwanted consequences - system may sustain large losses, create user dissatisfaction, collect exploration data help improve ranking quality. offline framework thus necessary let us decide policy apply production environment ensure positive outcome. here, describe offline framework. using framework, study popular exploration policy - thompson sampling. show different ways implementing multi-result ranking systems, different semantic interpretation leading different results terms sustained click-through-rate (ctr) loss expected model improvement. particular, demonstrate thompson sampling act online learner optimizing ctr, cases lead interesting outcome: lift ctr exploration. observation important production systems suggests one get valuable exploration data improve ranking performance long run, time increase ctr exploration lasts.",4 "3d camouflaging object using rgb-d sensors. paper proposes new optical camouflage system uses rgb-d cameras, acquiring point cloud background scene, tracking observers eyes. system enables user conceal object located behind display surrounded 3d objects. considered tracked point observer eyes light source, system work estimating shadow shape display device falls objects background. system uses 3d observer eyes locations display corners predict shadow points nearest neighbors constructed point cloud background scene.",4 "pitman-yor diffusion trees. introduce pitman yor diffusion tree (pydt) hierarchical clustering, generalization dirichlet diffusion tree (neal, 2001) removes restriction binary branching structure. generative process described shown result exchangeable distribution data points. prove theoretical properties model present two inference methods: collapsed mcmc sampler allows us model uncertainty tree structures, computationally efficient greedy bayesian em search algorithm. algorithms use message passing tree structure. 
utility model algorithms demonstrated synthetic real world data, continuous binary.",19 "machine learning methods analyze arabidopsis thaliana plant root growth. one challenging problems biology classify plants based reaction genetic mutation. arabidopsis thaliana plant interesting, genetic structure similarities human beings. biologists classify type plant mutated mutated (wild) types. phenotypic analysis types time-consuming costly effort individuals. paper, propose modified feature extraction step using velocity acceleration root growth. second step, plant classification, employed different support vector machine (svm) kernels two hybrid systems neural networks. gated negative correlation learning (gncl) mixture negatively correlated experts (mnce) two ensemble methods based complementary feature classical classifiers; mixture expert (me) negative correlation learning (ncl). hybrid systems conserve advantages decrease effects disadvantages ncl me. experimental shows mnce gncl improve efficiency classical classifiers, however, svm kernels function better performance classifiers based neural network ensemble method. moreover, kernels consume less time obtain classification rate.",4 "continuous features discretization anomaly intrusion detectors generation. network security growing issue, evolution computer systems expansion attacks. biological systems inspiring scientists designs new adaptive solutions, genetic algorithms. paper, present approach uses genetic algorithm generate anomaly network intrusion detectors. paper, algorithm propose use discretization method continuous features selected intrusion detection, create homogeneity values, different data types. then, the intrusion detection system tested nsl-kdd data set using different distance methods.
comparison held amongst results, shown end proposed approach good results, recommendations given future experiments.",4 "driven distraction: self-supervised distractor learning robust monocular visual odometry urban environments. present self-supervised approach ignoring ""distractors"" camera images purposes robustly estimating vehicle motion cluttered urban environments. leverage offline multi-session mapping approaches automatically generate per-pixel ephemerality mask depth map input image, use train deep convolutional network. run-time use predicted ephemerality depth input monocular visual odometry (vo) pipeline, using either sparse features dense photometric matching. approach yields metric-scale vo using single camera recover correct egomotion even 90% image obscured dynamic, independently moving objects. evaluate robust vo methods 400km driving oxford robotcar dataset demonstrate reduced odometry drift significantly improved egomotion estimation presence large moving vehicles urban traffic.",4 "arabic keyphrase extraction using linguistic knowledge machine learning techniques. paper, supervised learning technique extracting keyphrases arabic documents presented. extractor supplied linguistic knowledge enhance efficiency instead relying statistical information term frequency distance. analysis, annotated arabic corpus used extract required lexical features document words. knowledge also includes syntactic rules based part speech tags allowed word sequences extract candidate keyphrases. work, abstract form arabic words used instead stem form represent candidate terms. abstract form hides inflections found arabic words. paper introduces new features keyphrases based linguistic knowledge, capture titles subtitles document. simple anova test used evaluate validity selected features. then, learning model built using lda - linear discriminant analysis - training documents. 
although, presented system trained using documents domain, experiments carried show significantly better performance existing arabic extractor systems, precision recall values reach double corresponding values systems especially lengthy non-scientific articles.",4 "iit bombay english-hindi parallel corpus. present iit bombay english-hindi parallel corpus. corpus compilation parallel corpora previously available public domain well new parallel corpora collected. corpus contains 1.49 million parallel segments, 694k segments previously available public domain. corpus pre-processed machine translation, report baseline phrase-based smt nmt translation results corpus. corpus used two editions shared tasks workshop asian language translation (2016 2017). corpus freely available non-commercial research. best knowledge, largest publicly available english-hindi parallel corpus.",4 "adaptive seeding gaussian mixture models. present new initialization methods expectation-maximization algorithm multivariate gaussian mixture models. methods adaptions well-known $k$-means++ initialization gonzalez algorithm. thereby aim close gap simple random, e.g. uniform, complex methods, crucially depend right choice hyperparameters. extensive experiments indicate usefulness methods compared common techniques methods, e.g. apply original $k$-means++ gonzalez directly, respect artificial well real-world data sets.",4 "gaussian process domain experts model adaptation facial behavior analysis. present novel approach supervised domain adaptation based upon probabilistic framework gaussian processes (gps). specifically, introduce domain-specific gps local experts facial expression classification face images. adaptation classifier facilitated probabilistic fashion conditioning target expert multiple source experts. furthermore, contrast existing adaptation approaches, also learn target expert available target data solely.
then, single confident classifier obtained combining predictions multiple experts based confidence. learning model efficient requires retraining/reweighting source classifiers. evaluate proposed approach two publicly available datasets multi-class (multipie) multi-label (disfa) facial expression classification. end, perform adaptation two contextual factors: 'where' (view) 'who' (subject). show experiments proposed approach consistently outperforms source target classifiers, using 30 target examples. also outperforms state-of-the-art approaches supervised domain adaptation.",19 "revisiting kernelized locality-sensitive hashing improved large-scale image retrieval. present simple powerful reinterpretation kernelized locality-sensitive hashing (klsh), general popular method developed vision community performing approximate nearest-neighbor searches arbitrary reproducing kernel hilbert space (rkhs). new perspective based viewing steps klsh algorithm appropriately projected space, several key theoretical practical benefits. first, eliminates problematic conceptual difficulties present existing motivation klsh. second, yields first formal retrieval performance bounds klsh. third, analysis reveals two techniques boosting empirical performance klsh. evaluate extensions several large-scale benchmark image retrieval data sets, show analysis leads improved recall performance least 12%, sometimes much higher, standard klsh method.",4 "question answering natural language understanding system based object-oriented semantics. algorithms question answering computer system oriented input logical processing text information presented. knowledge domain consideration social behavior person. database system includes internal representation natural language sentences supplemental information. answer {\it yes} {\it no} formed general question. 
special question containing interrogative word group interrogative words permits find subject, object, place, time, cause, purpose way action event. answer generation based identification algorithms persons, organizations, machines, things, places, times. proposed algorithms question answering realized information systems closely connected text processing (criminology, operation business, medicine, document systems).",4 "functional decision theory: new theory instrumental rationality. paper describes motivates new decision theory known functional decision theory (fdt), distinct causal decision theory evidential decision theory. functional decision theorists hold normative principle action treat one's decision output fixed mathematical function answers question, ""which output function would yield best outcome?"" adhering principle delivers number benefits, including ability maximize wealth array traditional decision-theoretic game-theoretic problems cdt edt perform poorly. using one simple coherent decision rule, functional decision theorists (for example) achieve utility cdt newcomb's problem, utility edt smoking lesion problem, utility parfit's hitchhiker problem. paper, define fdt, explore prescriptions number different decision problems, compare cdt edt, give philosophical justifications fdt normative theory decision-making.",4 "network intrusions detection system based quantum bio inspired algorithm. network intrusion detection systems (nidss) role identifying malicious activities monitoring behavior networks. due currently high volume networks traffic addition increased number attacks dynamic properties, nidss challenge improving classification performance. bio-inspired optimization algorithms (bios) used automatically extract discrimination rules normal abnormal behavior improve classification accuracy detection ability nids. quantum vaccined immune clonal algorithm estimation distribution algorithm (qvica-with eda) proposed paper build new nids.
proposed algorithm used classification algorithm new nids trained tested using kdd data set. also, new nids compared another detection system based particle swarm optimization (pso). results shows ability proposed algorithm achieving high intrusions classification accuracy highest obtained accuracy 94.8 %.",4 "simple, efficient, neural algorithms sparse coding. sparse coding basic task many fields including signal processing, neuroscience machine learning goal learn basis enables sparse representation given set data, one exists. standard formulation non-convex optimization problem solved practice heuristics based alternating minimization. recent work resulted several algorithms sparse coding provable guarantees, somewhat surprisingly outperformed simple alternating minimization heuristics. give general framework understanding alternating minimization leverage analyze existing heuristics design new ones also provable guarantees. algorithms seem implementable simple neural architectures, original motivation olshausen field (1997a) introducing sparse coding. also give first efficient algorithm sparse coding works almost information theoretic limit sparse recovery incoherent dictionaries. previous algorithms approached surpassed limit run time exponential natural parameter. finally, algorithms improve upon sample complexity existing approaches. believe analysis framework applications settings simple iterative algorithms used.",4 "computational model affects. article provides simple logical structure, affective concepts (i.e. concepts related emotions feelings) defined. set affects defined similar set emotions covered occ model (ortony a., collins a., clore g. l.: cognitive structure emotions. cambridge university press, 1988), model presented article fully computationally defined.",4 "noisy power method: meta algorithm applications. provide new robust convergence analysis well-known power method computing dominant singular vectors matrix call noisy power method.
result characterizes convergence behavior algorithm significant amount noise introduced matrix-vector multiplication. noisy power method seen meta-algorithm recently found number important applications broad range machine learning problems including alternating minimization matrix completion, streaming principal component analysis (pca), privacy-preserving spectral analysis. general analysis subsumes several existing ad-hoc convergence bounds resolves number open problems multiple applications including streaming pca privacy-preserving singular vector computation.",4 "approximation guarantees greedy low rank optimization. provide new approximation guarantees greedy low rank matrix estimation standard assumptions restricted strong convexity smoothness. novel analysis also uncovers previously unknown connections low rank estimation combinatorial optimization, much bounds reminiscent corresponding approximation bounds submodular maximization. additionally, also provide statistical recovery guarantees. finally, present empirical comparison greedy estimation established baselines two important real-world problems.",19 "computational models: bottom-up top-down aspects. computational models visual attention become popular past decade, believe primarily two reasons: first, models make testable predictions explored experimentalists well theoreticians, second, models practical technological applications interest applied science engineering communities. chapter, take critical look recent attention modeling efforts. focus {\em computational models attention} defined tsotsos \& rothenstein \shortcite{tsotsos_rothenstein11}: models process visual stimulus (typically, image video clip), possibly also given task definition, make predictions compared human animal behavioral physiological responses elicited stimulus task. 
thus, place less emphasis abstract models, phenomenological models, purely data-driven fitting extrapolation models, models specifically designed single task restricted class stimuli. theoretical models, refer reader number previous reviews address attention theories models generally \cite{itti_koch01nrn,paletta_etal05,frintrop_etal10,rothenstein_tsotsos08,gottlieb_balan10,toet11,borji_itti12pami}.",4 "bridging neural machine translation bilingual dictionaries. neural machine translation (nmt) become new state-of-the-art several language pairs. however, remains challenging problem integrate nmt bilingual dictionary mainly contains words rarely never seen bilingual training data. paper, propose two methods bridge nmt bilingual dictionaries. core idea behind design novel models transform bilingual dictionaries adequate sentence pairs, nmt distil latent bilingual mappings ample repetitive phenomena. one method leverages mixed word/character model attempts synthesizing parallel sentences guaranteeing massive occurrence translation lexicon. extensive experiments demonstrate proposed methods remarkably improve translation quality, rare words test sentences obtain correct translations covered dictionary.",4 "generic deep networks wavelet scattering. introduce two-layer wavelet scattering network, object classification. scattering transform computes spatial wavelet transform first layer new joint wavelet transform along spatial, angular scale variables second layer. numerical experiments demonstrate two layer convolution network, involves learning max pooling, performs efficiently complex image data sets caltech, structural objects variability clutter. opens possibility simplify deep neural network learning initializing first layers wavelet filters.",4 "informed sampler: discriminative approach bayesian inference generative computer vision models. computer vision hard large variability lighting, shape, texture; addition image signal non-additive due occlusion. 
generative models promised account variability accurately modelling image formation process function latent variables prior beliefs. bayesian posterior inference could then, principle, explain observation. intuitively appealing, generative models computer vision largely failed deliver promise due difficulty posterior inference. result community favoured efficient discriminative approaches. still believe usefulness generative models computer vision, argue need leverage existing discriminative even heuristic computer vision methods. implement idea principled way ""informed sampler"" careful experiments demonstrate challenging generative models contain renderer programs components. concentrate problem inverting existing graphics rendering engine, approach understood ""inverse graphics"". informed sampler, using simple discriminative proposals based existing computer vision technology, achieves significant improvements inference.",4 "optimal auctions deep learning. designing auction maximizes expected revenue intricate task. indeed, today--despite major efforts impressive progress past years--only single-item case fully understood. work, initiate exploration use tools deep learning topic. design objective revenue optimal, dominant-strategy incentive compatible auctions. show multi-layer neural networks learn almost-optimal auctions settings analytical solutions, myerson's auction single item, manelli vincent's mechanism single bidder additive preferences two items, yao's auction two additive bidders binary support distributions multiple items, even prior knowledge form optimal auctions encoded network feedback training revenue regret. show characterization results, even rather implicit ones rochet's characterization induced utilities gradients, leveraged obtain precise fits optimal design. 
conclude demonstrating potential deep learning deriving optimal auctions high revenue poorly understood problems.",4 "multi-step-ahead time series prediction using multiple-output support vector regression. accurate time series prediction long future horizons challenging great interest practitioners academics. well-known intelligent algorithm, standard formulation support vector regression (svr) could taken multi-step-ahead time series prediction, relying either iterated strategy direct strategy. study proposes novel multiple-step-ahead time series prediction approach employs multiple-output support vector regression (m-svr) multiple-input multiple-output (mimo) prediction strategy. addition, rank three leading prediction strategies svr comparatively examined, providing practical implications selection prediction strategy multi-step-ahead forecasting taking svr modeling technique. proposed approach validated simulated real datasets. quantitative comprehensive assessments performed basis prediction accuracy computational cost. results indicate that: 1) m-svr using mimo strategy achieves best accurate forecasts accredited computational load, 2) standard svr using direct strategy achieves second best accurate forecasts, expensive computational cost, 3) standard svr using iterated strategy worst terms prediction accuracy, least computational cost.",4 "role zero synapses unsupervised feature learning. synapses real neural circuits take discrete values, including zero (silent potential) synapses. computational role zero synapses unsupervised feature learning unlabeled noisy data still unclear, thus important understand sparseness synaptic activity shaped learning relationship receptive field formation. here, formulate kind sparse feature learning statistical mechanics approach. find learning decreases fraction zero synapses, fraction decreases rapidly around critical data size, intrinsically structured receptive field starts develop. 
increasing data size refines receptive field, small fraction zero synapses remain act contour detectors. phenomenon discovered learning handwritten digits dataset, also learning retinal neural activity measured natural-movie-stimuli experiment.",16 "causal network inference via group sparse regularization. paper addresses problem inferring sparse causal networks modeled multivariate auto-regressive (mar) processes. conditions derived group lasso (glasso) procedure consistently estimates sparse network structure. key condition involves ""false connection score."" particular, show consistent recovery possible even number observations network far less number parameters describing network, provided false connection score less one. false connection score also demonstrated useful metric recovery non-asymptotic regimes. conditions suggest modified glasso procedure tends improve false connection score reduce chances reversing direction causal influence. computational experiments real network based electrocorticogram (ecog) simulation study demonstrate effectiveness approach.",19 "infochemical core. vocalizations less often gestures object linguistic research decades. however, development general theory communication human language particular case requires clear understanding organization communication means. infochemicals chemical compounds carry information employed small organisms cannot emit acoustic signals optimal frequency achieve successful communication. distribution infochemicals across species investigated ranked degree number species associated (because produce sensitive it). quality fit different functions dependency degree rank evaluated penalty number parameters function. surprisingly, double zipf (a zipf distribution two regimes different exponent each) model yielding best fit although function largest number parameters. 
suggests world wide repertoire infochemicals contains chemical nucleus shared many species reminiscent core vocabularies found human language dictionaries large corpora.",16 "deep reconstruction-classification networks unsupervised domain adaptation. paper, propose novel unsupervised domain adaptation algorithm based deep learning visual object recognition. specifically, design new model called deep reconstruction-classification network (drcn), jointly learns shared encoding representation two tasks: i) supervised classification labeled source data, ii) unsupervised reconstruction unlabeled target data. in way, learnt representation preserves discriminability, also encodes useful information target domain. new drcn model optimized using backpropagation similarly standard neural networks. evaluate performance drcn series cross-domain object recognition tasks, drcn provides considerable improvement (up ~8% accuracy) prior state-of-the-art algorithms. interestingly, also observe reconstruction pipeline drcn transforms images source domain images whose appearance resembles target dataset. suggests drcn's performance due constructing single composite representation encodes information structure target images classification source images. finally, provide formal analysis justify algorithm's objective domain adaptation context.",4 "continuous time dynamic topic models. paper, develop continuous time dynamic topic model (cdtm). cdtm dynamic topic model uses brownian motion model latent topics sequential collection documents, ""topic"" pattern word use expect evolve course collection. derive efficient variational approximate inference algorithm takes advantage sparsity observations text, property lets us easily handle many time points. contrast cdtm, original discrete-time dynamic topic model (ddtm) requires time discretized. moreover, complexity variational inference ddtm grows quickly time granularity increases, drawback limits fine-grained discretization.
demonstrate cdtm two news corpora, reporting predictive perplexity novel task time stamp prediction.",4 speeding-up decision making learning agent using ion trap quantum processor. report proof-of-principle experimental demonstration quantum speed-up learning agents utilizing small-scale quantum information processor based radiofrequency-driven trapped ions. decision-making process quantum learning agent within projective simulation paradigm machine learning implemented system two qubits. latter realized using hyperfine states two frequency-addressed atomic ions exposed static magnetic field gradient. show deliberation time quantum learning agent quadratically improved respect comparable classical learning agents. performance quantum-enhanced learning agent highlights potential scalable quantum processors taking advantage machine learning.,18 "material classification wild: synthesized training data generalise better real-world training data?. question dominant role real-world training images field material classification investigating whether synthesized data generalise effectively real-world data. experimental results three challenging real-world material databases show best performing pre-trained convolutional neural network (cnn) architectures achieve 91.03% mean average precision classifying materials cross-dataset scenarios. demonstrate synthesized data achieve improvement mean average precision used training data conjunction pre-trained cnn architectures, spans ~ 5% ~ 19% across three widely used material databases real-world images.",4 "linear algorithm digital euclidean connected skeleton. skeleton essential shape characteristic providing compact representation studied shape. computation image grid raises many issues. due effects discretization, required properties skeleton - thinness, homotopy shape, reversibility, connectivity - may become incompatible. however, regards practical use, choice specific skeletonization algorithm depends application. 
allows classify desired properties order importance, tend towards critical ones. goal make skeleton dedicated shape matching recognition. so, discrete skeleton thin - represented graph -, robust noise, reversible - initial shape fully reconstructed - homotopic shape. propose linear-time skeletonization algorithm based squared euclidean distance map extract maximal balls ridges. thinning pruning process, obtain skeleton. proposed method finally compared fairly recent methods.",4 "outlying property detection numerical attributes. outlying property detection problem problem discovering properties distinguishing given object, known advance outlier database, database objects. paper, analyze problem within context numerical attributes taken account, represents relevant case left open literature. introduce measure quantify degree outlierness object, associated relative likelihood value, compared relative likelihood objects database. major contribution, present efficient algorithm compute outlierness relative significant subsets data. latter subsets characterized ""rule-based"" fashion, hence basis underlying explanation outlierness.",4 "relax localize: value algorithms. show principled way deriving online learning algorithms minimax analysis. various upper bounds minimax value, previously thought non-constructive, shown yield algorithms. allows us seamlessly recover known methods derive new ones. framework also captures ""unorthodox"" methods follow perturbed leader r^2 forecaster. emphasize understanding inherent complexity learning problem leads development algorithms. define local sequential rademacher complexities associated algorithms allow us obtain faster rates online learning, similarly statistical learning theory. based localized complexities build general adaptive method take advantage suboptimality observed sequence. present number new algorithms, including family randomized methods use idea ""random playout"". 
several new versions follow-the-perturbed-leader algorithms presented, well methods based littlestone's dimension, efficient methods matrix completion trace norm, algorithms problems transductive learning prediction static experts.",4 "traffic sign classification using deep inception based convolutional networks. work, propose novel deep network traffic sign classification achieves outstanding performance gtsrb surpassing previous methods. deep network consists spatial transformer layers modified version inception module specifically designed capturing local global features together. features adoption allows network classify precisely intraclass samples even deformations. use spatial transformer layer makes network robust deformations translation, rotation, scaling input images. unlike existing approaches developed hand-crafted features, multiple deep networks huge parameters data augmentations, method addresses concern exploding parameters augmentations. achieved state-of-the-art performance 99.81\% gtsrb dataset.",4 "detecting blackholes volcanoes directed networks. paper, formulate novel problem finding blackhole volcano patterns large directed graph. specifically, blackhole pattern group made set nodes way inlinks group rest nodes graph. contrast, volcano pattern group outlinks rest nodes graph. patterns observed real world. instance, trading network, blackhole pattern may represent group traders manipulating market. paper, first prove blackhole mining problem dual problem finding volcanoes. therefore, focus finding blackhole patterns. along line, design two pruning schemes guide blackhole finding process. first pruning scheme, strategically prune search space based set pattern-size-independent pruning rules develop iblackhole algorithm. second pruning scheme follows divide-and-conquer strategy exploit pruning results first pruning scheme. 
indeed, target directed graphs divided several disconnected subgraphs first pruning scheme, thus blackhole finding conducted disconnected subgraph rather large graph. based two pruning schemes, also develop iblackhole-dc algorithm. finally, experimental results real-world data show iblackhole-dc algorithm several orders magnitude faster iblackhole algorithm, huge computational advantage brute-force method.",4 "parallel chromatic mcmc spatial partitioning. introduce novel approach parallelizing mcmc inference models spatially determined conditional independence relationships, existing techniques exploiting graphical model structure applicable. approach motivated model seismic events signals, events detected distant regions approximately independent given intermediate regions. perform parallel inference coloring factor graph defined regions latent space, rather individual model variables. evaluating model seismic event detection, achieve significant speedups serial mcmc degradation inference quality.",19 "gradient estimation using stochastic computation graphs. variety problems originating supervised, unsupervised, reinforcement learning, loss function defined expectation collection random variables, might part probabilistic model external world. estimating gradient loss function, using samples, lies core gradient-based learning algorithms problems. introduce formalism stochastic computation graphs---directed acyclic graphs include deterministic functions conditional probability distributions---and describe easily automatically derive unbiased estimator loss function's gradient. resulting algorithm computing gradient estimator simple modification standard backpropagation algorithm. generic scheme propose unifies estimators derived variety prior work, along variance-reduction techniques therein. 
could assist researchers developing intricate models involving combination stochastic deterministic operations, enabling, example, attention, memory, control actions.",4 "causal discovery binary exclusive-or skew acyclic model: bexsam. discovering causal relations among observed variables given data set major objective studies statistics artificial intelligence. recently, techniques discover unique causal model explored based non-gaussianity observed data distribution. however, limited continuous data. paper, present novel causal model binary data propose efficient new approach deriving unique causal model governing given binary data set skew distributions external binary noises. experimental evaluation shows excellent performance artificial real world data sets.",19 "total-order partial-order planning: comparative analysis. many years, intuitions underlying partial-order planning largely taken granted. past years renewed interest fundamental principles underlying paradigm. paper, present rigorous comparative analysis partial-order total-order planning focusing two specific planners directly compared. show subtle assumptions underly wide-spread intuitions regarding supposed efficiency partial-order planning. instance, superiority partial-order planning depend critically upon search strategy structure search space. understanding underlying assumptions crucial constructing efficient planners.",4 "guaranteed clustering biclustering via semidefinite programming. identifying clusters similar objects data plays significant role wide range applications. model problem clustering, consider densest k-disjoint-clique problem, whose goal identify collection k disjoint cliques given weighted complete graph maximizing sum densities complete subgraphs induced cliques. paper, establish conditions ensuring exact recovery densest k cliques given graph optimal solution particular semidefinite program. 
particular, semidefinite relaxation exact input graphs corresponding data consisting k large, distinct clusters smaller number outliers. approach also yields semidefinite relaxation biclustering problem similar recovery guarantees. given set objects set features exhibited objects, biclustering seeks simultaneously group objects features according expression levels. problem may posed partitioning nodes weighted bipartite complete graph sum densities resulting bipartite complete subgraphs maximized. analysis densest k-disjoint-clique problem, show correct partition objects features recovered optimal solution semidefinite program case given data consists several disjoint sets objects exhibiting similar features. empirical evidence numerical experiments supporting theoretical guarantees also provided.",12 "used neural networks detect clickbaits: believe happened next!. online content publishers often use catchy headlines articles order attract users websites. headlines, popularly known clickbaits, exploit user's curiosity gap lure click links often disappoint them. existing methods automatically detecting clickbaits rely heavy feature engineering domain knowledge. here, introduce neural network architecture based recurrent neural networks detecting clickbaits. model relies distributed word representations learned large unannotated corpora, character embeddings learned via convolutional neural networks. experimental results dataset news headlines show model outperforms existing techniques clickbait detection accuracy 0.98 f1-score 0.98 roc-auc 0.99.",4 "learning action models: qualitative approach. dynamic epistemic logic, actions described using action models. paper introduce framework studying learnability action models observations. present first results concerning propositional action models. 
first check two basic learnability criteria: finite identifiability (conclusively inferring appropriate action model finite time) identifiability limit (inconclusive convergence right action model). show deterministic actions finitely identifiable, non-deterministic actions require learning power - they identifiable limit. move particular learning method, proceeds via restriction space events within learning-specific action model. way learning closely resembles well-known update method dynamic epistemic logic. introduce several different learning methods suited finite identifiability particular types deterministic actions.",4 "learning decode linear codes using deep learning. novel deep learning method improving belief propagation algorithm proposed. method generalizes standard belief propagation algorithm assigning weights edges tanner graph. edges trained using deep learning techniques. well-known property belief propagation algorithm independence performance transmitted codeword. crucial property new method decoder preserved property. furthermore, property allows us learn single codeword instead exponential number code-words. improvements belief propagation algorithm demonstrated various high density parity check codes.",4 "comparing deep neural networks humans: object recognition signal gets weaker. human visual object recognition typically rapid seemingly effortless, well largely independent viewpoint object orientation. recently, animate visual systems ones capable remarkable computational feat. changed rise class computer vision algorithms called deep neural networks (dnns) achieve human-level classification performance object recognition tasks. furthermore, growing number studies report similarities way dnns human visual system process objects, suggesting current dnns may good models human visual object recognition. yet clearly exist important architectural processing differences state-of-the-art dnns primate visual system.
potential behavioural consequences differences well understood. aim address issue comparing human dnn generalisation abilities towards image degradations. find human visual system robust image manipulations like contrast reduction, additive noise novel eidolon-distortions. addition, find progressively diverging classification error-patterns man dnns signal gets weaker, indicating may still marked differences way humans current dnns perform visual object recognition. envision findings well carefully measured freely available behavioural datasets provide new useful benchmark computer vision community improve robustness dnns motivation neuroscientists search mechanisms brain could facilitate robustness.",4 "variational recurrent neural machine translation. partially inspired successful applications variational recurrent neural networks, propose novel variational recurrent neural machine translation (vrnmt) model paper. different variational nmt, vrnmt introduces series latent random variables model translation procedure sentence generative way, instead single latent variable. specifically, latent random variables included hidden states nmt decoder elements variational autoencoder. way, variables recurrently generated, enables capture strong complex dependencies among output translations different timesteps. order deal challenges performing efficient posterior inference large-scale training incorporation latent variables, build neural posterior approximator, equip reparameterization technique estimate variational lower bound. experiments chinese-english english-german translation tasks demonstrate proposed model achieves significant improvements conventional variational nmt models.",4 "stochastic generative hashing. learning-based binary hashing become powerful paradigm fast search retrieval massive databases. however, due requirement discrete outputs hash functions, learning functions known challenging. 
addition, objective functions adopted existing hashing techniques mostly chosen heuristically. paper, propose novel generative approach learn hash functions minimum description length principle learned hash codes maximally compress dataset also used regenerate inputs. also develop efficient learning algorithm based stochastic distributional gradient, avoids notorious difficulty caused binary output constraints, jointly optimize parameters hash function associated generative model. extensive experiments variety large-scale datasets show proposed method achieves better retrieval results existing state-of-the-art methods.",4 "using mechanical turk build machine translation evaluation sets. building machine translation (mt) test sets relatively expensive task. mt becomes increasingly desired language pairs domains, becomes necessary build test sets case. paper, investigate using amazon's mechanical turk (mturk) make mt test sets cheaply. find mturk used make test sets much cheaper professionally-produced test sets. importantly, experiments multiple mt systems, find mturk-produced test sets yield essentially conclusions regarding system performance professionally-produced test sets yield.",4 "marginal simultaneous predictive classification using stratified graphical models. inductive probabilistic classification rule must generally obey principles bayesian predictive inference, observed unobserved stochastic quantities jointly modeled parameter uncertainty fully acknowledged posterior predictive distribution. several rules recently considered asymptotic behavior characterized assumption observed features variables used building classifier conditionally independent given simultaneous labeling training samples unknown origin. extend theoretical results predictive classifiers acknowledging feature dependencies either graphical models sparser alternatives defined stratified graphical models. 
also show experimentation synthetic real data predictive classifiers based stratified graphical models consistently best accuracy compared predictive classifiers based either conditionally independent features ordinary graphical models.",19 "end-to-end weakly-supervised semantic alignment. tackle task semantic alignment goal compute dense semantic correspondence aligning two images depicting objects category. challenging task due large intra-class variation, changes viewpoint background clutter. present following three principal contributions. first, develop convolutional neural network architecture semantic alignment trainable end-to-end manner weak image-level supervision form matching image pairs. outcome parameters learnt rich appearance variation present different semantically related images without need tedious manual annotation correspondences training time. second, main component architecture differentiable soft inlier scoring module, inspired ransac inlier scoring procedure, computes quality alignment based geometrically consistent correspondences thereby reducing effect background clutter. third, demonstrate proposed approach achieves state-of-the-art performance multiple standard benchmarks semantic alignment.",4 "shattered gradients problem: resnets answer, question?. long-standing obstacle progress deep learning problem vanishing exploding gradients. problem largely overcome introduction carefully constructed initializations batch normalization. nevertheless, architectures incorporating skip-connections resnets perform much better standard feedforward architectures despite well-chosen initialization batch normalization. paper, identify shattered gradients problem. specifically, show correlation gradients standard feedforward networks decays exponentially depth resulting gradients resemble white noise. contrast, gradients architectures skip-connections far resistant shattering decaying sublinearly. 
detailed empirical evidence presented support analysis, fully-connected networks convnets. finally, present new ""looks linear"" (ll) initialization prevents shattering. preliminary experiments show new initialization allows train deep networks without addition skip-connections.",4 "proceedings fifth workshop developments computational models--computational models nature. special theme dcm 2009, co-located icalp 2009, concerned computational models nature, particular emphasis computational models derived physics biology. intention bring together different approaches - community strong foundational background proffered icalp attendees - create inspirational cross-boundary exchanges, lead innovative research. specifically dcm 2009 sought contributions quantum computation information, probabilistic models, chemical, biological bio-inspired ones, including spatial models, growth models models self-assembly. contributions putting test logical algorithmic aspects computing (e.g., continuous computing dynamical systems, solid state computing models) also much welcomed.",4 "diachronic word embeddings reveal statistical laws semantic change. understanding words change meanings time key models language cultural evolution, historical data meaning scarce, making theories hard develop test. word embeddings show promise diachronic tool, carefully evaluated. develop robust methodology quantifying semantic change evaluating word embeddings (ppmi, svd, word2vec) known historical changes. use methodology reveal statistical laws semantic evolution. using six historical corpora spanning four languages two centuries, propose two quantitative laws semantic change: (i) law conformity---the rate semantic change scales inverse power-law word frequency; (ii) law innovation---independent frequency, words polysemous higher rates semantic change.",4 "classification approaches challenges frequent subgraphs mining biological networks. 
understanding structure dynamics biological networks one important challenges system biology. addition, increasing amount experimental data biological networks necessitate use efficient methods analyze huge amounts data. methods require recognize common patterns analyze data. biological networks modeled graphs, problem common patterns recognition equivalent frequent subgraph mining set graphs. paper, first challenges frequent subgraphs mining biological networks introduced existing approaches classified challenge. algorithms analyzed basis type approach apply challenges.",4 "weakly submodular maximization beyond cardinality constraints: randomization help greedy?. submodular functions broad class set functions, naturally arise diverse areas. many algorithms suggested maximization functions. unfortunately, function deviates submodularity, known algorithms may perform arbitrarily poorly. amending issue, obtaining approximation results set functions generalizing submodular functions, focus recent works. one class, known weakly submodular functions, received lot attention. key result proved das kempe (2011) showed approximation ratio greedy algorithm weakly submodular maximization subject cardinality constraint degrades smoothly distance submodularity. however, results obtained maximization subject constraints beyond cardinality. particular, known whether greedy algorithm achieves non-trivial approximation ratio constraints. paper, prove randomized version greedy algorithm (previously used buchbinder et al. (2014) different problem) achieves approximation ratio $(1 + 1/\gamma)^{-2}$ maximization weakly submodular function subject general matroid constraint, $\gamma$ parameter measuring distance function submodularity. moreover, also experimentally compare performance version greedy algorithm real world problems natural benchmarks, show algorithm study performs well also practice.
best knowledge, first algorithm non-trivial approximation guarantee maximizing weakly submodular function subject constraint simple cardinality constraint. particular, first algorithm guarantee important broad class matroid constraints.",4 "visualizing loss landscape neural nets. neural network training relies ability find ""good"" minimizers highly non-convex loss functions. well known certain network architecture designs (e.g., skip connections) produce loss functions train easier, well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers generalize better. however, reasons differences, effect underlying loss landscape, well understood. paper, explore structure neural loss functions, effect loss landscapes generalization, using range visualization methods. first, introduce simple ""filter normalization"" method helps us visualize loss function curvature, make meaningful side-by-side comparisons loss functions. then, using variety visualizations, explore network architecture affects loss landscape, training parameters affect shape minimizers.",4 "zipf's law emerges asymptotically phase transitions communicative systems. zipf's law predicts power-law relationship word rank frequency language communication systems, widely reported texts yet remains enigmatic origins. computer simulations shown language communication systems emerge abrupt phase transition fidelity mappings symbols objects. since phase transition approximates heaviside step function, show zipfian scaling emerges asymptotically high rank based laplace transform. thereby demonstrate zipf's law gradually emerges moment phase transition communicative systems. show power-law scaling behavior explains emergence natural languages phase transitions.
find emergence zipf's law language communication suggests use rare words lexicon critical construction effective communicative system phase transition.",15 "drunet: dilated-residual u-net deep learning network digitally stain optic nerve head tissues optical coherence tomography images. given neural connective tissues optic nerve head (onh) exhibit complex morphological changes development progression glaucoma, simultaneous isolation optical coherence tomography (oct) images may great interest clinical diagnosis management pathology. deep learning algorithm designed trained digitally stain (i.e. highlight) 6 onh tissue layers capturing local (tissue texture) contextual information (spatial arrangement tissues). overall dice coefficient (mean tissues) $0.91 \pm 0.05$ assessed manual segmentations performed expert observer. offer robust segmentation framework could extended automated parametric study onh tissues.",4 "chromatag: colored marker fast detection algorithm. current fiducial marker detection algorithms rely marker ids false positive rejection. time wasted potential detections eventually rejected false positives. introduce chromatag, fiducial marker detection algorithm designed use opponent colors limit quickly reject initial false detections grayscale precise localization. experiments, show chromatag significantly faster current fiducial markers achieving similar better detection accuracy. also show tag size viewing direction effect detection accuracy. contribution significant fiducial markers often used real-time applications (e.g. marker assisted robot navigation) heavy computation required parts system.",4 "end-to-end learning action detection frame glimpses videos. work introduce fully end-to-end approach action detection videos learns directly predict temporal bounds actions. intuition process detecting actions naturally one observation refinement: observing moments video, refining hypotheses action occurring. 
based insight, formulate model recurrent neural network-based agent interacts video time. agent observes video frames decides look next emit prediction. since backpropagation adequate non-differentiable setting, use reinforce learn agent's decision policy. model achieves state-of-the-art results thumos'14 activitynet datasets observing fraction (2% less) video frames.",4 "rule-based emotion detection social media: putting tweets plutchik's wheel. study sentiment analysis beyond typical granularity polarity instead use plutchik's wheel emotions model. introduce rbem-emo extension rule-based emission model algorithm deduce emotions human-written messages. evaluate approach two different datasets compare performance current state-of-the-art techniques emotion detection, including recursive auto-encoder. results experimental study suggest rbem-emo promising approach advancing current state-of-the-art emotion detection.",4 "stochastic dual coordinate ascent methods regularized loss minimization. stochastic gradient descent (sgd) become popular solving large scale supervised machine learning optimization problems svm, due strong theoretical guarantees. closely related dual coordinate ascent (dca) method implemented various software packages, far lacked good convergence analysis. paper presents new analysis stochastic dual coordinate ascent (sdca) showing class methods enjoy strong theoretical guarantees comparable better sgd. analysis justifies effectiveness sdca practical applications.",19 "learning polynomial networks classification clinical electroencephalograms. describe polynomial network technique developed learning classify clinical electroencephalograms (eegs) presented noisy features. using evolutionary strategy implemented within group method data handling, learn classification models comprehensively described sets short-term polynomials. polynomial models learnt classify eegs recorded alzheimer healthy patients recognize eeg artifacts. 
comparing performances technique machine learning methods conclude technique learn well-suited polynomial models experts find easy-to-understand.",4 "learning attend, copy, generate session-based query suggestion. users try articulate complex information needs search sessions reformulating queries. make process effective, search engines provide related queries help users specifying information need search process. paper, propose customized sequence-to-sequence model session-based query suggestion. model, employ query-aware attention mechanism capture structure session context. enables us control scope session infer suggested next query, helps handle noisy data also automatically detect session boundaries. furthermore, observe that, based user query reformulation behavior, within single session large portion query terms retained previously submitted queries consists mostly infrequent unseen terms usually included vocabulary. therefore empower decoder model access source words session context decoding incorporating copy mechanism. moreover, propose evaluation metrics assess quality generative models query suggestion. conduct extensive set experiments analysis. results suggest model outperforms baselines terms generating queries scoring candidate queries task query suggestion.",4 "author-topic model authors documents. introduce author-topic model, generative model documents extends latent dirichlet allocation (lda; blei, ng, & jordan, 2003) include authorship information. author associated multinomial distribution topics topic associated multinomial distribution words. document multiple authors modeled distribution topics mixture distributions associated authors. apply model collection 1,700 nips conference papers 160,000 citeseer abstracts. exact inference intractable datasets use gibbs sampling estimate topic author distributions.
compare performance two generative models documents, special cases author-topic model: lda (a topic model) simple author model author associated distribution words rather distribution topics. show topics recovered author-topic model, demonstrate applications computing similarity authors entropy author output.",4 "equivalence distance-based rkhs-based statistics hypothesis testing. provide unifying framework linking two classes statistics used two-sample independence testing: one hand, energy distances distance covariances statistics literature; other, maximum mean discrepancies (mmd), is, distances embeddings distributions reproducing kernel hilbert spaces (rkhs), established machine learning. case energy distance computed semimetric negative type, positive definite kernel, termed distance kernel, may defined mmd corresponds exactly energy distance. conversely, positive definite kernel, interpret mmd energy distance respect negative-type semimetric. equivalence readily extends distance covariance using kernels product space. determine class probability distributions test statistics consistent alternatives. finally, investigate performance family distance kernels two-sample independence tests: show particular energy distance commonly employed statistics one member parametric family kernels, choices family yield powerful tests.",19 "predicting co-evolution event knowledge graphs. embedding learning, a.k.a. representation learning, shown able model large-scale semantic knowledge graphs. key concept mapping knowledge graph tensor representation whose entries predicted models using latent representations generalized entities. knowledge graphs typically treated static: knowledge graph grows links facts become available ground truth values associated links considered time invariant. paper address issue knowledge graphs triple states depend time. assume changes knowledge graph always arrive form events, sense events gateway knowledge graph. 
train event prediction model uses knowledge graph background information information recent events. predicting future events, also predict likely changes knowledge graph thus obtain model evolution knowledge graph well. experiments demonstrate approach performs well clinical application, recommendation engine sensor network application.",4 "classification approach based association rules mining unbalanced data. paper deals binary classification task target class lower probability occurrence. situation, possible build powerful classifier using standard methods logistic regression, classification tree, discriminant analysis, etc. overcome short-coming methods yield classifiers low sensibility, tackled classification problem approach based association rules learning. approach advantage allowing identification patterns well correlated target class. association rules learning well known method area data-mining. used dealing large database unsupervised discovery local patterns expresses hidden relationships input variables. considering association rules supervised learning point view, relevant set weak classifiers obtained one derives classifier performs well.",19 "valid optimal assignment kernels applications graph classification. success kernel methods initiated design novel positive semidefinite functions, particular structured data. leading design paradigm convolution kernel, decomposes structured objects parts sums pairs parts. assignment kernels, contrast, obtained optimal bijection parts, provide valid notion similarity. general however, optimal assignments yield indefinite functions, complicates use kernel methods. characterize class base kernels used compare parts guarantees positive semidefinite optimal assignment kernels. base kernels give rise hierarchies optimal assignment kernels computed linear time histogram intersection. apply results developing weisfeiler-lehman optimal assignment kernel graphs. 
it provides high classification accuracy on widely-used benchmark data sets, improving over the original weisfeiler-lehman kernel.",4 "towards label imbalance in multi-label classification with many labels. in multi-label classification, an instance may be associated with a set of labels simultaneously. recently, research on multi-label classification has largely shifted its focus to the other end of the spectrum, where the number of labels is assumed to be extremely large. existing works focus on the design of scalable algorithms that offer fast training procedures and a small memory footprint. however, they ignore, and can even compound, another challenge - the label imbalance problem. to address this drawback, we propose a novel representation-based multi-label learning with sampling (rmls) approach. to the best of our knowledge, we are the first to tackle the imbalance problem in multi-label classification with many labels. experiments on real-world datasets demonstrate the effectiveness of the proposed approach.",4 "a discussion on the validation tests employed to compare human action recognition methods using the msr action3d dataset. this paper aims to determine the best human action recognition method based on features extracted from rgb-d devices, such as the microsoft kinect. a review of papers that make reference to msr action3d, the most used dataset that includes depth information acquired from an rgb-d device, is performed. we found that the validation method used in each work differs from the others. so, a direct comparison among the works cannot be made. however, almost all the works present their results comparing them with others without taking this issue into account. therefore, we present different rankings according to the methodology used for validation, in order to clarify the existing confusion.",4 "an ant colony algorithm for the weighted item layout optimization problem. this paper discusses the problem of placing weighted items in a circular container in two-dimensional space. this problem has great practical significance in various mechanical engineering domains, such as the design of communication satellites. two constructive heuristics are proposed, one for packing circular items and one for packing rectangular items. they work by first optimizing the object placement order and then optimizing the object positioning. based on these heuristics, an ant colony optimization (aco) algorithm is described which searches first for the optimal positioning order and then for the optimal layout. we describe the results of numerical experiments, in which we test two versions of the aco algorithm alongside local search methods previously described in the literature. the results show that the constructive heuristic-based aco performs better than existing methods on larger problem instances.",4 "mga trajectory planning with an aco-inspired algorithm. given a set of celestial bodies, the problem of finding an optimal sequence of swing-bys, deep space manoeuvres (dsm) and transfer arcs connecting the elements of the set is combinatorial in nature. the number of possible paths grows exponentially with the number of celestial bodies. therefore, the design of an optimal multiple gravity assist (mga) trajectory is an np-hard mixed combinatorial-continuous problem. its automated solution would greatly improve the design of future space missions, allowing the assessment of a large number of alternative mission options in a short time. this work proposes to formulate the complete automated design of a multiple gravity assist trajectory as an autonomous planning and scheduling problem. the resulting scheduled plan provides the optimal planetary sequence and a good estimation of the set of associated optimal trajectories. the trajectory model consists of a sequence of celestial bodies connected by two-dimensional transfer arcs containing one dsm each. for each transfer arc, the position of the planet and the spacecraft, and the time of arrival, are matched by varying the pericentre of the preceding swing-by, or the magnitude of the launch excess velocity for the first arc. for each departure date, the model generates a full tree of possible transfers from the departure to the destination planet. each leaf of the tree represents a planetary encounter and a possible way to reach that planet. an algorithm inspired by ant colony optimization (aco) is devised to explore the space of possible plans. the ants explore the tree from departure to destination, adding one node at a time: every time an ant reaches a node, a probability function is used to select a feasible direction. this approach to automatic trajectory planning is applied to the design of optimal transfers to saturn and among the galilean moons of jupiter.",4 "first-order methods almost always avoid saddle points. we establish that first-order methods avoid saddle points for almost all initializations. our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent and variants thereof. the connecting thread is that such algorithms can be studied from a dynamical systems perspective, in which appropriate instantiations of the stable manifold theorem allow for a global stability analysis. thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points.",19 "parallel tracking and verifying: a framework for real-time and high accuracy visual tracking. being intensively studied, visual tracking has seen great recent advances in either speed (e.g., with correlation filters) or accuracy (e.g., with deep features). real-time and high accuracy tracking algorithms, however, remain scarce. in this paper we study the problem from a new perspective and present a novel parallel tracking and verifying (ptav) framework, taking advantage of the ubiquity of multi-thread techniques and borrowing from the success of parallel tracking and mapping in visual slam. the ptav framework typically consists of two components, a tracker t and a verifier v, working in parallel on two separate threads. the tracker t aims to provide super real-time tracking inference and is expected to perform well most of the time; by contrast, the verifier v checks the tracking results and corrects t when needed. the key innovation is that v does not work on every frame but only upon requests from t; on the other end, t may adjust the tracking according to the feedback from v. with this collaboration, ptav enjoys both the high efficiency provided by t and the strong discriminative power of v. in extensive experiments on popular benchmarks including otb2013, otb2015, tc128 and uav20l, ptav achieves the best tracking accuracy among all real-time trackers, and in fact performs even better than many deep learning based solutions. moreover, as a general framework, ptav is very flexible, with great room for improvement and generalization.",4 "dual supervised learning. many supervised learning tasks have emerged in dual forms, e.g., english-to-french translation vs. french-to-english translation, speech recognition vs. text to speech, and image classification vs. image generation.
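a toy illustration of the saddle-avoidance result in the first-order-methods entry above (my example, not the paper's): for f(x, y) = x² − y², the origin is a saddle point; gradient descent from a random initialization almost surely has a nonzero y-component and is repelled from the saddle, while an initialization on the measure-zero stable manifold y = 0 converges to it.

```python
import numpy as np

def grad(p):
    # gradient of f(x, y) = x**2 - y**2, which has a saddle at the origin
    x, y = p
    return np.array([2 * x, -2 * y])

def gradient_descent(p0, step=0.1, iters=100):
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p - step * grad(p)
    return p

rng = np.random.default_rng(1)
# random initialization: the y-coordinate grows geometrically, escaping the saddle
p_rand = gradient_descent(rng.normal(size=2))
# initialization on the stable manifold y = 0: converges to the saddle instead
p_stable = gradient_descent([1.0, 0.0])
```

the set of initializations that converge to the saddle has measure zero, which is exactly the "almost all initializations" statement of the entry.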
such two dual tasks have intrinsic connections with each other due to the probabilistic correlation between their models. this connection is, however, not effectively utilized today, since people usually train the models of two dual tasks separately and independently. in this work, we propose training the models of two dual tasks simultaneously, explicitly exploiting the probabilistic correlation between them to regularize the training process. for ease of reference, we call the proposed approach \emph{dual supervised learning}. we demonstrate that dual supervised learning can improve the practical performance of both tasks, for various applications including machine translation, image processing, and sentiment analysis.",4 "the echo state condition at the critical point. recurrent networks with transfer functions that fulfill lipschitz continuity with k=1 may be echo state networks if certain limitations on the recurrent connectivity are applied. it has been shown that it is sufficient if the largest singular value of the recurrent connectivity is smaller than 1. the main achievement of this paper is a proof of the conditions under which the network is an echo state network even if the largest singular value is one. it turns out that in this critical case the exact shape of the transfer function plays a decisive role in determining whether the network still fulfills the echo state condition. in addition, several examples with one-neuron networks are outlined to illustrate the effects of critical connectivity. moreover, within this manuscript a mathematical definition of a critical echo state network is suggested.",4 "random feedback weights support learning in deep neural networks. the brain processes information through many layers of neurons. this deep architecture is representationally powerful, but it complicates learning by making it hard to identify the responsible neurons when a mistake is made. in machine learning, the backpropagation algorithm assigns blame to a neuron by computing exactly how it contributed to an error. to do this, it multiplies error signals by matrices consisting of the synaptic weights on each neuron's axon and farther downstream. this operation requires a precisely choreographed transport of synaptic weight information, which is thought to be impossible in the brain. here we present a surprisingly simple algorithm for deep learning, which assigns blame by multiplying error signals by random synaptic weights.
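a linear-network sketch of the random-feedback idea described in the entry above (my toy construction, not the paper's experiments): the error is sent backwards through a fixed random matrix B instead of the transpose of the forward weights, as backpropagation would require, and the training loss still falls.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 10, 20, 5
T = rng.normal(size=(n_out, n_in)) / np.sqrt(n_in)   # teacher linear map

W1 = 0.1 * rng.normal(size=(n_hidden, n_in))         # learned forward weights
W2 = 0.1 * rng.normal(size=(n_out, n_hidden))
B = 0.1 * rng.normal(size=(n_hidden, n_out))         # fixed random feedback weights

lr, losses = 0.05, []
for _ in range(2000):
    x = rng.normal(size=(n_in, 32))                  # mini-batch of inputs
    y = T @ x                                        # teacher outputs
    h = W1 @ x                                       # forward pass (linear network)
    y_hat = W2 @ h
    e = y_hat - y                                    # output error
    losses.append(float((e ** 2).mean()))
    # feedback alignment: the error travels back through the fixed random B,
    # not through W2.T as exact backpropagation would require
    W2 -= lr * (e @ h.T) / 32
    W1 -= lr * ((B @ e) @ x.T) / 32
```

dimensions, learning rate and iteration count here are arbitrary choices made for the sketch; the point is only that learning proceeds without weight transport.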
we show that a network can learn to extract useful information from signals sent through these random feedback connections. in essence, the network learns to learn. we demonstrate that this new mechanism performs as quickly and accurately as backpropagation on a variety of problems, and describe the principles which underlie its function. our demonstration provides a plausible basis for how a neuron can be adapted using error signals generated at distal locations in the brain, and thus dispels long-held assumptions about the algorithmic constraints on learning in neural circuits.",16 "horn: a system for parallel training and regularizing of large-scale neural networks. we introduce a new distributed system for effective training and regularizing of large-scale neural networks on distributed computing architectures. the experiments demonstrate the effectiveness of flexible model partitioning and parallelization strategies based on a neuron-centric computation model, with an implementation of collective and parallel dropout neural network training. experiments are performed on mnist handwritten digits classification, and results are included.",4 "on the complexity of curve fitting algorithms. we study a popular algorithm for fitting polynomial curves to scattered data, based on least squares with gradient weights. we show that sometimes this algorithm admits a substantial reduction of complexity, and, furthermore, we find precise conditions under which this is possible. it turns out that this is, indeed, possible when one fits circles, ellipses and hyperbolas.",4 "offline handwritten signature identification using adaptive window positioning techniques. to address this challenge, this paper proposes the use of an adaptive window positioning technique which focuses not only on the meaning of the handwritten signature but also on the individuality of the writer. this innovative technique divides the handwritten signature into 13 small windows of size nxn (13x13). this size is large enough to contain ample information about the style of the author and small enough to ensure good identification performance. the process was tested on the gpds data set containing 4870 signature samples from 90 different writers, by comparing the robust features of the test signature with those of the user's signature using an appropriate classifier.
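the circle case from the curve-fitting entry above can be illustrated with a simple algebraic least-squares fit. the sketch below is my illustrative stand-in, not the gradient-weighted algorithm analyzed in the entry: a circle x² + y² + D·x + E·y + F = 0 is linear in (D, E, F), so scattered points can be fitted by ordinary least squares.

```python
import numpy as np

def fit_circle(x, y):
    # algebraic least-squares circle fit: solve x^2 + y^2 + D*x + E*y + F = 0
    # for (D, E, F), which is a linear problem in those three unknowns
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x ** 2 + y ** 2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2, -E / 2                 # centre
    r = np.sqrt(cx ** 2 + cy ** 2 - F)      # radius
    return cx, cy, r

# noisy samples from a circle of radius 2 centred at (1, -1)
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
x = 1 + 2 * np.cos(t) + 0.01 * rng.normal(size=t.size)
y = -1 + 2 * np.sin(t) + 0.01 * rng.normal(size=t.size)
cx, cy, r = fit_circle(x, y)
```

for circles the problem reduces to one linear solve; for ellipses and hyperbolas the implicit equation has more parameters and the fitting is correspondingly more involved.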
the experimental results reveal that the adaptive window positioning technique is an efficient and reliable method for accurate signature feature extraction and identification of offline handwritten signatures. a contribution of this technique is that it can be used to detect signatures signed under emotional duress.",4 "language structure in the n-object naming game. we examine the naming game with two agents trying to establish a common vocabulary for n objects. their efforts lead to the emergence of a language that allows for efficient communication and exhibits a degree of homonymy and synonymy. although homonymy reduces communication efficiency, it seems to be a dynamical trap that persists for a long, perhaps indefinite, time. on the other hand, synonymy does not reduce the efficiency of communication, but it appears to be only a transient feature of the language. thus, in our model the role of synonymy decreases and in the long-time limit it becomes negligible. a similar rareness of synonymy is observed in present natural languages. the role of noise, which distorts the communicated words, is also examined. although, in general, noise reduces communication efficiency, it also regroups the words so that they are more evenly distributed within the available ""verbal"" space.",4 "planning while learning. this paper introduces a framework for planning while learning, where an agent is given a goal to achieve in an environment whose behavior is only partially known to the agent. we discuss the tractability of various plan-design processes. we show that for a large natural class of planning while learning systems, a plan can be presented and verified in reasonable time. however, coming up with such a plan algorithmically, even for simple classes of systems, is apparently intractable. we emphasize the role of off-line plan-design processes, and show that, in natural cases, the verification (projection) part can be carried out in an efficient algorithmic manner.",4 "rotation invariance neural network. rotation invariance and translation invariance have great value in image recognition tasks. in this paper, we bring a new architecture for convolutional neural networks (cnn), named the cyclic convolutional layer, to achieve rotation invariance in 2-d symbol recognition. we can also get the position and orientation of the 2-d symbol from the network, to achieve a detection purpose for multiple non-overlapping targets. last but not least, this architecture can achieve one-shot learning in some cases using this invariance.",4 "two-stage sampled learning theory on distributions. we focus on the distribution regression problem: regressing a real-valued response from a probability distribution. although there exist a large number of similarity measures between distributions, very little is known about their generalization performance in specific learning tasks. learning problems formulated on distributions have an inherent two-stage sampled difficulty: in practice only samples from sampled distributions are observable, and one has to build an estimate from similarities computed between sets of points. to the best of our knowledge, the only existing method with consistency guarantees for distribution regression requires kernel density estimation as an intermediate step (which suffers from slow convergence issues in high dimensions), and the domain of the distributions to be compact euclidean. in this paper, we provide theoretical guarantees for a remarkably simple algorithmic alternative to solve the distribution regression problem: embed the distributions into a reproducing kernel hilbert space, and learn a ridge regressor from the embeddings to the outputs. our main contribution is to prove the consistency of this technique in the two-stage sampled setting under mild conditions (on separable, topological domains endowed with kernels). as a special case, we answer a 15-year-old open question: we establish the consistency of the classical set kernel [haussler, 1999; gartner et. al, 2002] in regression, and we cover more recent kernels on distributions, including those due to [christmann and steinwart, 2010].",12 "hashing algorithms for large-scale learning. in this paper, we first demonstrate that b-bit minwise hashing, whose estimators are positive definite kernels, can be naturally integrated with learning algorithms such as svm and logistic regression. we adopt a simple scheme to transform the nonlinear (resemblance) kernel into a linear (inner product) kernel; hence large-scale problems can be solved extremely efficiently. our method provides a simple and effective solution to large-scale learning on massive and extremely high-dimensional datasets, especially when the data do not fit in memory.
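a toy sketch of the minwise hashing primitive underlying the entry above (plain minwise hashing for estimating jaccard/resemblance similarity; the b-bit truncation and the kernel construction of the paper are omitted, and the hash family here is an illustrative choice):

```python
import random

P = 2**31 - 1  # a mersenne prime used as the hash modulus

def minhash_signature(s, n_hashes=200, seed=0):
    # simulate n_hashes random permutations with random linear hash functions;
    # the signature keeps the minimum hash value of the set under each one
    rng = random.Random(seed)
    params = [(rng.randrange(1, P), rng.randrange(0, P)) for _ in range(n_hashes)]
    return [min((a * x + b) % P for x in s) for a, b in params]

def estimate_jaccard(sig1, sig2):
    # the probability that two sets share the same minimum hash value
    # equals their jaccard (resemblance) similarity
    return sum(m1 == m2 for m1, m2 in zip(sig1, sig2)) / len(sig1)

A = set(range(0, 80))
B = set(range(40, 120))   # true jaccard = 40 / 120 = 1/3
est = estimate_jaccard(minhash_signature(A), minhash_signature(B))
```

storing only the lowest b bits of each minimum, as the paper proposes, shrinks the signatures further at a quantifiable cost in estimator variance.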
we compare b-bit minwise hashing with the vowpal wabbit (vw) algorithm (which is related to the count-min (cm) sketch). interestingly, vw has the same variances as random projections. our theoretical and empirical comparisons illustrate that usually $b$-bit minwise hashing is significantly more accurate (at the same storage) than vw (and random projections) on binary data. furthermore, $b$-bit minwise hashing can be combined with vw to achieve further improvements in terms of training speed, especially when $b$ is large.",19 "a cross-language framework for word recognition and spotting in indic scripts. handwritten word recognition and spotting in low-resource scripts is difficult, as sufficient training data is not available and it is often expensive to collect data for such scripts. this paper presents a novel cross-language platform for handwritten word recognition and spotting in low-resource scripts, where training is performed with a sufficiently large dataset available for one script (considered the source script) and testing is done on other scripts (considered target scripts). training with one source script and testing with another script to obtain a reasonable result is not easy in the handwriting domain, due to the complex nature of handwriting variability among scripts. it is also difficult to map source and target characters as they appear in cursive word images. the proposed indic cross-language framework exploits a large resource dataset for training and uses it for recognizing and spotting text in target scripts where a sufficient amount of training data is not available. since indic scripts are mostly written in 3 zones, namely, upper, middle and lower, we employ zone-wise character (or component) mapping for efficient learning purposes. the performance of our cross-language framework depends on the extent of similarity between the source and target scripts. hence, we devise an entropy-based script similarity score using the source-to-target character mapping, which provides the feasibility of cross-language transcription. we tested our approach on three indic scripts, namely, bangla, devanagari and gurumukhi, and the corresponding results are reported.",4 "supervised saliency map driven segmentation of lesions in dermoscopic images. lesion segmentation is the first step in automatic melanoma recognition systems. deficiencies and difficulties in dermoscopic images make lesion segmentation an intricate task, e.g., hair occlusion, the presence of dark corners and color charts, indistinct lesion borders, and lesions touching the image boundaries. in order to overcome these problems, we propose a supervised saliency detection method specially tailored for dermoscopic images, based on the discriminative regional feature integration (drfi) method. the drfi method incorporates multi-level segmentation, regional contrast, property and backgroundness descriptors, and a random forest regressor to create saliency scores for each region of the image. in our improved saliency detection method, mdrfi, we add to the features of the regional property descriptors a proposed novel pseudo-background region to boost performance. the overall segmentation framework uses the saliency map to construct an initial mask of the lesion through thresholding and post-processing operations. the initial mask is then evolved in a level set framework to fit the lesion boundaries better. results of evaluation experiments on three public datasets show that the proposed segmentation method outperforms conventional state-of-the-art segmentation algorithms, and its performance is comparable to recent deep convolutional neural network based approaches.",4 "exploiting sparsity to build efficient kernel based collaborative filtering for top-n item recommendation. the increasing availability of implicit feedback datasets has raised interest in developing effective collaborative filtering techniques able to deal with asymmetrically unambiguous positive feedback and ambiguous negative feedback. in this paper, we propose a principled kernel-based collaborative filtering method for top-n item recommendation with implicit feedback. we present an efficient implementation using the linear kernel, and we show how to generalize it to kernels of the dot product family while preserving efficiency. we also investigate the elements that influence the sparsity of the standard cosine kernel. this analysis shows that the sparsity of the kernel strongly depends on the properties of the dataset, in particular on the long tail distribution.
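a minimal sketch of linear-kernel (dot-product) item-based scoring for implicit-feedback top-n recommendation, in the spirit of the entry above (illustrative only; the paper's method and its sparsity analysis are richer than this):

```python
import numpy as np

# binary implicit-feedback matrix R: rows = users, columns = items
R = np.array([
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [1, 1, 0, 0, 0],
], dtype=float)

K = R.T @ R                      # linear kernel between items (co-occurrence counts)
np.fill_diagonal(K, 0)           # an item should not recommend itself

scores = R @ K                   # user-item scores from item-item similarities
scores[R > 0] = -np.inf          # mask items the user has already consumed

def top_n(user, n=2):
    # indices of the n highest-scoring unseen items for the given user
    return list(np.argsort(scores[user])[::-1][:n])

recs = top_n(0)
```

normalizing the columns of `R` before forming `K` turns the linear kernel into the cosine kernel whose sparsity the entry discusses.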
we compare our method with state-of-the-art algorithms, achieving good results in terms of both efficiency and effectiveness.",4 "attention-set based metric learning for video face recognition. face recognition has made great progress with the development of deep learning. however, video face recognition (vfr) is still an ongoing task due to various illumination, low resolution, pose variations and motion blur. most existing cnn-based vfr methods obtain a feature vector from a single image and simply aggregate the features over a video, and thus less consider the correlations of face images within one video. in this paper, we propose a novel attention-set based metric learning (asml) method to measure the statistical characteristics of image sets. it is a promising and generalized extension of maximum mean discrepancy with memory attention weighting. first, we define an effective distance metric on image sets, which explicitly minimizes the intra-set distance and maximizes the inter-set distance simultaneously. second, inspired by the neural turing machine, a memory attention weighting is proposed to adapt to set-aware global contents. asml is naturally integrated into cnns, resulting in an end-to-end learning scheme. our method achieves state-of-the-art performance for the task of video face recognition on three widely used benchmarks including youtubeface, youtube celebrities and celebrity-1000.",4 "positive definite matrices and the s-divergence. positive definite matrices abound in a dazzling variety of applications. this ubiquity can in part be attributed to their rich geometric structure: positive definite matrices form a self-dual convex cone whose strict interior is a riemannian manifold. the manifold view comes endowed with a ""natural"" distance function, while the conic view does not. nevertheless, drawing motivation from the conic view, we introduce the s-divergence as a ""natural"" distance-like function on the open cone of positive definite matrices. we motivate the s-divergence via a sequence of results that connect it to the riemannian distance. in particular, we show that (a) this divergence is the square of a distance; and (b) it has several geometric properties similar to those of the riemannian distance, though without being computationally as demanding. the s-divergence is even more intriguing: although nonconvex, we can still compute matrix means and medians with it, with global optimality.
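the s-divergence of the entry above has a short closed form, S(A, B) = logdet((A + B)/2) − ½·logdet(A) − ½·logdet(B) (this is the jensen-bregman logdet, or stein, divergence). a minimal sketch:

```python
import numpy as np

def s_divergence(A, B):
    # S(A, B) = logdet((A + B) / 2) - 1/2 logdet(A) - 1/2 logdet(B)
    # slogdet is used for numerical stability on positive definite inputs
    _, ld_mid = np.linalg.slogdet((A + B) / 2)
    _, ld_a = np.linalg.slogdet(A)
    _, ld_b = np.linalg.slogdet(B)
    return ld_mid - 0.5 * (ld_a + ld_b)

def random_spd(n, rng):
    # random symmetric positive definite matrix (illustrative construction)
    M = rng.normal(size=(n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
A, B = random_spd(4, rng), random_spd(4, rng)
```

the function is symmetric, zero exactly when A = B, and strictly positive otherwise on the positive definite cone, consistent with its role as a squared distance.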
we complement these results with numerical experiments illustrating our theorems, and an optimization algorithm for computing matrix medians.",12 "using natural language processing to screen patients with active heart failure: an exploration for hospital-wide surveillance. in this paper, we proposed two different approaches, a rule-based approach and a machine-learning based approach, to identify active heart failure cases automatically by analyzing electronic health records (ehr). for the rule-based approach, we extracted cardiovascular data elements from clinical notes and matched patients to different colors according to their heart failure condition, using rules provided by experts in heart failure. it achieved 69.4% accuracy and a 0.729 f1-score. for the machine learning approach, with bigrams of clinical notes as features, we tried four different models, and svm with a linear kernel achieved the best performance, with 87.5% accuracy and a 0.86 f1-score. also, from the classification comparison between the four different models, we believe that linear models fit this problem better. by combining the machine-learning and rule-based algorithms, we can enable hospital-wide surveillance of active heart failure with increased accuracy and interpretability of the outputs.",4 "from neural pca to deep unsupervised learning. a network supporting deep unsupervised learning is presented. the network is an autoencoder with lateral shortcut connections from the encoder to the decoder at each level of the hierarchy. the lateral shortcut connections allow the higher levels of the hierarchy to focus on abstract invariant features. while standard autoencoders are analogous to latent variable models with a single layer of stochastic variables, the proposed network is analogous to hierarchical latent variable models. learning combines the denoising autoencoder and denoising source separation frameworks. each layer of the network contributes to the cost function a term which measures the distance between the representations produced by the encoder and the decoder. since training signals originate at all levels of the network, all layers can learn efficiently even in deep networks. the speedup offered by the cost terms from higher levels of the hierarchy, and the ability to learn invariant features, are demonstrated in experiments.",19 "image pixel fusion for human face recognition. in this paper we present a technique for the fusion of optical and thermal face images based on an image pixel fusion approach. among the several factors which affect face recognition performance in the case of visual images, illumination changes are a significant factor that needs to be addressed. thermal images are better at handling illumination conditions, but are not consistent in capturing the texture details of faces. other factors like sunglasses, beard, moustache etc. also play an active role in adding complications to the recognition process. the fusion of thermal and visual images is a solution to overcome the drawbacks present in individual thermal and visual face images. the fused images are projected into an eigenspace, and the projected images are classified using a radial basis function (rbf) neural network and also a multi-layer perceptron (mlp). for the experiments, the object tracking and classification beyond the visible spectrum (otcbvs) database benchmark of thermal and visual face images is used. a comparison of the experimental results shows that the proposed approach performs significantly well in recognizing face images, with success rates of 96% and 95.07% for the rbf neural network and mlp respectively.",4 "god(s) know(s): developmental and cross-cultural patterns in children's drawings. this paper introduces a novel approach to data analysis designed for the needs of specialists in the psychology of religion. we detect developmental and cross-cultural patterns in children's drawings of god(s) and other supernatural agents. we develop methods to objectively evaluate our empirical observations of the drawings with respect to: (1) the gravity center, (2) the average intensities of the colors \emph{green} and \emph{yellow}, (3) the use of different colors (palette) and (4) the visual complexity of the drawings. we find statistically significant differences across ages and countries in the gravity centers and the average intensities of colors. these findings support the hypotheses of the experts and raise new questions for investigation.",4 "a linear time natural evolution strategy for non-separable functions. we present a novel natural evolution strategy (nes) variant, the rank-one nes (r1-nes), which uses a low rank approximation of the search distribution covariance matrix. the algorithm allows computation of the natural gradient with cost linear in the dimensionality of the parameter space, and excels in solving high-dimensional non-separable problems, including the best result to date on the rosenbrock function (512 dimensions).",4 "integrating prosodic and lexical cues for automatic topic segmentation. we present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. we propose two methods for combining lexical and prosodic information using hidden markov models and decision trees. lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. we evaluate our approach on the broadcast news corpus, using the darpa-tdt evaluation metrics. results show that the prosodic model alone is competitive with word-based segmentation methods. furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources.",4 "stable recovery of sparse vectors from random sinusoidal feature maps. random sinusoidal features are a popular approach for speeding up kernel-based inference in large datasets. prior to the inference stage, this approach suggests performing dimensionality reduction by first multiplying each data vector by a random gaussian matrix, and then computing an element-wise sinusoid. theoretical analysis shows that collecting a sufficient number of such features can be reliably used for subsequent inference in kernel classification and regression. in this work, we demonstrate that with a mild increase in the dimension of the embedding, it is also possible to reconstruct the data vector from such random sinusoidal features, provided that the underlying data is sparse enough. in particular, we propose a numerically stable algorithm for reconstructing the data vector given the nonlinear features, and analyze its sample complexity. our algorithm can be extended to other types of structured inverse problems, such as demixing a pair of sparse (but incoherent) vectors. we support the efficacy of our approach via numerical experiments.",19 "possibility neutrosophic soft sets with applications in decision making and similarity measure. in this paper, the concept of possibility neutrosophic soft sets and their operations are defined, and their properties are studied. an application of this theory in decision making is investigated.
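the random sinusoidal feature map used in the sparse-recovery entry above is short to write down. a minimal sketch (a rahimi-recht-style random fourier feature construction for the gaussian kernel; the scaling conventions here are one common choice, made for illustration):

```python
import numpy as np

def sinusoidal_features(X, W, b):
    # z(x) = sqrt(2 / D) * cos(W x + b); inner products of these features
    # approximate a gaussian kernel between the original data vectors
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
d, D, sigma = 5, 4000, 1.0
W = rng.normal(scale=1.0 / sigma, size=(D, d))   # random gaussian frequencies
b = rng.uniform(0, 2 * np.pi, size=D)            # random phases

x = rng.normal(size=(1, d))
y = rng.normal(size=(1, d))
approx = float(sinusoidal_features(x, W, b) @ sinusoidal_features(y, W, b).T)
exact = float(np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2)))
```

the entry's point is the reverse direction: when `x` is sparse, a slightly larger `D` makes the map invertible in a numerically stable way.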
also, a similarity measure between two possibility neutrosophic soft sets is introduced and discussed. finally, an application of this similarity measure is given, to select a suitable person for a position in a firm.",4 "ensemble methods for convex regression with applications to geometric programming based circuit design. convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. new piecewise linear convex regression methods are fast and scalable, but can be unstable when used to approximate constraints or objective functions for optimization. ensemble methods, like bagging, smearing and random partitioning, can alleviate this problem and maintain the theoretical properties of the underlying estimator. we empirically examine the performance of ensemble methods for prediction and optimization, and apply them to device modeling and constraint approximation for geometric programming based circuit design.",4 "a combinatorial algorithm to compute regularization paths. for a wide variety of regularization methods, algorithms computing the entire solution path have been developed recently. solution path algorithms do not just compute the solution for one particular value of the regularization parameter but the entire path of solutions, making the selection of an optimal parameter much easier. currently used algorithms are not robust in the sense that they cannot deal with general degenerate input. here we present a new robust, generic method for parametric quadratic programming. our algorithm directly applies to nearly all machine learning applications, where so far every application has required a different algorithm. we illustrate the usefulness of our method by applying it to a low rank problem which could not be solved by existing path tracking methods, namely to compute part-worth values in choice based conjoint analysis, a popular technique from market research to estimate consumers' preferences on a class of parameterized options.",4 "dvqa: understanding data visualizations via question answering. bar charts are an effective way for humans to convey information to each other, but today's algorithms cannot parse them. existing methods fail when faced with even minor variations in appearance. here, we present dvqa, a dataset that tests many aspects of bar chart understanding in a question answering framework. unlike visual question answering (vqa), dvqa requires processing words and answers that are unique to a particular bar chart. state-of-the-art vqa algorithms perform poorly on dvqa, and we propose two strong baselines that perform considerably better. our work will enable algorithms to automatically extract semantic information from vast quantities of literature in science, business, and other areas.",4 "toward a robust diversity-based model to detect changes of context. being able to automatically and quickly understand the user context during a session is a main issue for recommender systems. as a first step toward achieving that goal, we propose a model that observes in real time the diversity brought by each item, relative to a short sequence of consultations corresponding to the recent user history. our model has a complexity in constant time, and is generic since it can apply to any type of items within an online service (e.g. profiles, products, music tracks) and any application domain (e-commerce, social network, music streaming), as long as we have partial item descriptions. the observation of the diversity level over time allows us to detect implicit changes. in the long term, we plan to characterize the context, i.e. to find common features among the contiguous sub-sequences of items between two changes of context determined by our model. this will allow us to make context-aware and privacy-preserving recommendations, and to explain them to the users. as this is ongoing research, the first step consists in studying the robustness of our model in detecting changes of context. in order to do so, we use a music corpus of 100 users and 210,000 consultations (number of songs played in the global history). we validate the relevancy of our detections by finding connections between changes of context and events, such as the ends of sessions. of course, these events are only a subset of the possible changes of context, since there might be several contexts within a session. we altered the quality of our corpus in several manners, so as to test the performance of our model when confronted with sparsity and different types of items. the results show that our model is robust and constitutes a promising approach.",4 word segmentation of micro-blog texts with an external lexicon and heterogeneous data. this paper describes our system designed for the nlpcc 2016 shared task on word segmentation of micro-blog texts.,4 learning deep structure-preserving image-text embeddings. this paper proposes a method for learning joint embeddings of images and text using a two-branch neural network with multiple layers of linear projections followed by nonlinearities. the network is trained using a large margin objective that combines cross-view ranking constraints with within-view neighborhood structure preservation constraints inspired by the metric learning literature. extensive experiments show that our approach gains significant improvements in accuracy for image-to-text and text-to-image retrieval. our method achieves new state-of-the-art results on the flickr30k and mscoco image-sentence datasets, and shows promise on the new task of phrase localization on the flickr30k entities dataset.,4 "using atl to define advanced and flexible constraint model transformations. transforming constraint models is an important task in recent constraint programming systems. user-understandable models are defined during the modeling phase, but rewriting and tuning are mandatory to get solving-efficient models. we propose a new architecture allowing us to define bridges between (modeling or solver) languages and to implement model optimizations. this architecture follows a model-driven approach where the constraint modeling process is seen as a set of model transformations. among other interesting features is the definition of transformations as concept-oriented rules, i.e. based on the types of the model elements, where the types are organized into a hierarchy called a metamodel.",4 "sentiment in new york city: a high resolution spatial and temporal view. measuring public sentiment is a key task for researchers and policymakers alike. the explosion of available social media data allows for more time-sensitive and geographically specific analysis than ever before. in this paper we analyze data from the micro-blogging site twitter to generate a sentiment map of new york city. we develop a classifier specifically tuned for 140-character twitter messages, or tweets, using key words, phrases and emoticons to determine the mood of each tweet. this method, combined with the geotagging provided by users, enables us to gauge public sentiment on extremely fine-grained spatial and temporal scales. we find that public mood is generally highest in public parks and lowest at transportation hubs, and we locate other areas of strong sentiment such as cemeteries, medical centers, a jail, and a sewage facility. sentiment progressively improves with proximity to times square. periodic patterns of sentiment fluctuate on both a daily and a weekly scale: more positive tweets are posted on weekends than on weekdays, with a daily peak in sentiment around midnight and a nadir between 9:00 a.m. and noon.",15 "image forgery localization based on multi-scale convolutional neural networks. in this paper, we propose to utilize convolutional neural networks (cnns) and a segmentation-based multi-scale analysis to locate tampered areas in digital images. first, to deal with color input from sliding windows of different scales, a unified cnn architecture is designed. then, we elaborately design the training procedures of the cnns on sampled training patches. with a set of robust multi-scale tampering detectors based on cnns, complementary tampering possibility maps can be generated. last but not least, a segmentation-based method is proposed to fuse the maps and generate the final decision map. by exploiting the benefits of both small-scale and large-scale analyses, the segmentation-based multi-scale analysis can lead to a performance leap in forgery localization with cnns. numerous experiments are conducted to demonstrate the effectiveness and efficiency of our method.",4 "faster coordinate descent via adaptive importance sampling. coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. in this work, we introduce new adaptive rules for the random selection of their updates. by adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates, and can change at each iteration. we theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. numerical evidence with hinge-loss support vector machines and lasso confirms that in practice the performance follows the theory.",4 "semi-bounded rationality: a model of decision making. in this paper the theory of semi-bounded rationality is proposed as an extension of the theory of bounded rationality. in particular, it is proposed that the decision making process involves two components: the correlation machine, which estimates missing values, and the causal machine, which relates cause to effect.
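a minimal coordinate-descent sketch in the setting of the entry above (plain cyclic coordinate descent with soft thresholding for the lasso; the paper's adaptive importance-sampling rules are not reproduced here):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(A, y, lam, iters=200):
    # cyclic coordinate descent for  min_x 0.5 * ||Ax - y||^2 + lam * ||x||_1
    n, d = A.shape
    x = np.zeros(d)
    r = y - A @ x                          # residual, kept up to date
    col_sq = (A ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(d):
            # partial correlation of column j with the residual, with the
            # current contribution of x[j] added back in
            rho = A[:, j] @ r + col_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_sq[j]
            r += A[:, j] * (x[j] - new_xj)  # incremental residual update
            x[j] = new_xj
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -3.0, 1.5]              # sparse ground truth
y = A @ x_true + 0.01 * rng.normal(size=100)
x_hat = lasso_cd(A, y, lam=0.5)
```

the adaptive rules of the entry replace the cyclic sweep with a coordinate distribution driven by dual-residual or duality-gap estimates, recomputed as the iterates change.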
Rational decision making involves using information, which is almost always imperfect and incomplete, together with an intelligent machine, which if it is a human being is inconsistent, to make decisions. In the theory of bounded rationality, a decision is made irrespective of the fact that the information used is incomplete and imperfect and that the human brain is inconsistent, and thus the decision made is taken within the bounds of these limitations. In the theory of semi-bounded rationality, signal processing is used to filter noise and outliers in the information, the correlation machine is applied to complete missing information, and artificial intelligence is used to make consistent decisions.",4 "A deep architecture for semantic parsing. Many successful approaches to semantic parsing build on top of the syntactic analysis of text, and make use of distributional representations or statistical models to match parses to ontology-specific queries. This paper presents a novel deep learning architecture which provides a semantic parsing system through the union of two neural models of language semantics. It allows for the generation of ontology-specific queries from natural language statements and questions without the need for parsing, which makes it especially suitable to grammatically malformed or syntactically atypical text, such as tweets, as well as permitting the development of semantic parsers for resource-poor languages.",4 "Training a convolutional neural network for appearance-invariant place recognition. Place recognition is one of the most challenging problems in computer vision, and has become a key part of mobile robotics and autonomous driving applications for performing loop closure in visual SLAM systems. Moreover, the difficulty of recognizing a revisited location increases with appearance changes caused, for instance, by weather or illumination variations, which hinders the long-term application of such algorithms in real environments. In this paper we present a convolutional neural network (CNN), trained for the first time with the purpose of recognizing revisited locations under severe appearance changes, which maps images to a low dimensional space where Euclidean distances represent place dissimilarity. In order for the network to learn the desired invariances, we train it with triplets of images selected from datasets which present a challenging variability in visual appearance. The triplets are selected in such a way that two samples are from the same location and the third one is taken from a different place.
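The triplet training signal in the place-recognition abstract above (two images from the same place, one from a different place) is commonly implemented as a margin loss over embedding distances. A minimal sketch, with an assumed margin value and squared-distance convention:

```python
def sqdist(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pushing the same-place pair closer than the
    different-place pair by at least `margin` (in squared distance)."""
    return max(0.0, sqdist(anchor, positive) - sqdist(anchor, negative) + margin)
```

The loss is zero once the different-place sample is sufficiently farther from the anchor than the same-place sample, which is exactly the geometry the abstract asks the CNN embedding to learn.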
We validate our system through extensive experimentation, where we demonstrate better performance than state-of-art algorithms on a number of popular datasets.",4 "Development and evaluation of a deep learning model for protein-ligand binding affinity prediction. Structure based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has recently gained considerable attention as it allows the model to ""learn"" to extract features that are relevant for the task at hand. We have developed a novel deep neural network for estimating the binding affinity of ligand-receptor complexes. The complex is represented with a 3D grid, and the model utilizes a 3D convolution to produce a feature map of this representation, treating the atoms of both proteins and ligands in the same manner. Our network was tested on the CASF ""scoring power"" benchmark and the Astex Diverse Set and outperformed classical scoring functions. The model, together with usage instructions and examples, is available as a git repository at http://gitlab.com/cheminfibb/pafnucy",19 "Deep learning and conditional random fields-based depth estimation and topographical reconstruction from conventional endoscopy. Colorectal cancer is the fourth leading cause of cancer deaths worldwide and the second leading cause in the United States. The risk of colorectal cancer can be mitigated by the identification and removal of premalignant lesions through optical colonoscopy. Unfortunately, conventional colonoscopy misses more than 20% of the polyps that should be removed, due in part to poor contrast of lesion topography. Imaging tissue topography during a colonoscopy is difficult because of the size constraints of the endoscope and the deforming mucosa. Most existing methods make geometric assumptions or incorporate a priori information, which limits accuracy and sensitivity. In this paper, we present a method that avoids these restrictions, using a joint deep convolutional neural network-conditional random field (CNN-CRF) framework. Estimated depth is used to reconstruct the topography of the surface of the colon from a single image. We train the unary and pairwise potential functions of a CRF in a CNN on synthetic data, generated by developing an endoscope camera model and rendering over 100,000 images of an anatomically-realistic colon.
We validate our approach on real endoscopy images of a porcine colon, transferred to a synthetic-like domain, with ground truth from registered computed tomography measurements. The CNN-CRF approach estimates depths with a relative error of 0.152 for synthetic endoscopy images and 0.242 for real endoscopy images. We show that the estimated depth maps can be used for reconstructing the topography of the mucosa from conventional colonoscopy images. This approach can easily be integrated into existing endoscopy systems and provides a foundation for improving computer-aided detection algorithms for detection, segmentation and classification of lesions.",4 "Deep multi-view spatial-temporal network for taxi demand prediction. Taxi demand prediction is an important building block for enabling intelligent transportation systems in a smart city. An accurate prediction model can help the city pre-allocate resources to meet travel demand and to reduce empty taxis on the streets, which waste energy and worsen traffic congestion. With the increasing popularity of taxi requesting services such as Uber and Didi Chuxing (in China), we are able to collect large-scale taxi demand data continuously. How to utilize such big data to improve the demand prediction is an interesting and critical real-world problem. Traditional demand prediction methods mostly rely on time series forecasting techniques, which fail to model the complex non-linear spatial and temporal relations. Recent advances in deep learning have shown superior performance on traditionally challenging tasks such as image classification by learning the complex features and correlations from large-scale data. This breakthrough has inspired researchers to explore deep learning techniques on traffic prediction problems. However, existing methods on traffic prediction have only considered the spatial relation (e.g., using CNN) or the temporal relation (e.g., using LSTM) independently. We propose a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework to model both spatial and temporal relations.
Specifically, our proposed model consists of three views: a temporal view (modeling correlations between future demand values and near time points via LSTM), a spatial view (modeling local spatial correlation via a local CNN), and a semantic view (modeling correlations among regions sharing similar temporal patterns). Experiments on large-scale real taxi demand data demonstrate the effectiveness of our approach over state-of-the-art methods.",4 "Learning to point and count. This paper proposes the problem of point-and-count as a test case to break the what-and-where deadlock. Different from the traditional detection problem, the goal is to discover key salient points as a way to localize and count the number of objects simultaneously. We propose two alternatives: one that counts first and then points, and another that works the other way around. Fundamentally, they pivot around whether we solve ""what"" or ""where"" first. We evaluate their performance on a dataset that contains multiple instances of the same class, demonstrating the potentials and their synergies. The experiences derive important insights that explain why this is a much harder problem than classification, including strong data bias and the inability to deal with object scales robustly in state-of-art convolutional neural networks.",4 "An automatic image de-fencing system. Tourists and wild-life photographers are often hindered in capturing their cherished images or videos by a fence that limits accessibility to the scene of interest. The situation is exacerbated by growing concerns of security at public places, and hence a need exists to provide a tool that can be used for post-processing such fenced videos to produce a de-fenced image. There are several challenges in this problem; we identify them as robust detection of fence/occlusions, estimating pixel motion of background scenes, and filling in the fence/occlusions by utilizing information from multiple frames of the input video. In this work, we aim to build an automatic post-processing tool that can efficiently rid the input video of occlusion artifacts like fences. Our work is distinguished by two major contributions. The first is the introduction of a learning based technique to detect fences and patterns in complicated backgrounds. The second is the formulation of an objective function whose minimization via loopy belief propagation fills in the fence pixels.
We observe that grids of the histogram of oriented gradients descriptor used with a support vector machines based classifier significantly outperform, in detection accuracy, texels in a lattice. We present results of experiments using several real-world videos to demonstrate the effectiveness of the proposed fence detection and de-fencing algorithm.",4 "Web-based question answering: a decision-making perspective. We describe an investigation of the use of probabilistic models and cost-benefit analyses to guide resource-intensive procedures used by a web-based question answering system. We first provide an overview of research on question-answering systems. Then, we present details of AskMSR, a prototype web-based question answering system. We discuss Bayesian analyses of the quality of answers generated by the system and show how we can endow the system with the ability to make decisions about the number of queries issued to a search engine, given the cost of queries and the expected value of query results in refining the ultimate answer. Finally, we review the results of a set of experiments.",4 "Efficient marginal likelihood computation for Gaussian process regression. In a Bayesian learning setting, the posterior distribution of a predictive model arises from a trade-off between its prior distribution and the conditional likelihood of observed data. Such distribution functions usually rely on additional hyperparameters which need to be tuned in order to achieve optimum predictive performance; this operation can be efficiently performed in an empirical Bayes fashion by maximizing the posterior marginal likelihood of the observed data. Since the score function of this optimization problem is in general characterized by the presence of local optima, it is necessary to resort to global optimization strategies, which require a large number of function evaluations. Given that a single evaluation is usually computationally intensive and badly scaled with respect to the dataset size, the maximum number of observations that can be treated simultaneously is quite limited. In this paper, we consider the case of hyperparameter tuning in Gaussian process regression. A straightforward implementation of the posterior log-likelihood for this model requires O(n^3) operations for every iteration of the optimization procedure, where n is the number of examples in the input dataset.
We derive a novel set of identities that allow, after an initial overhead of O(n^3), the evaluation of the score function, as well as the Jacobian and Hessian matrices, in O(n) operations. We prove how the proposed identities, which follow from the eigendecomposition of the kernel matrix, yield a reduction of several orders of magnitude in the computation time for the hyperparameter optimization problem. Notably, the proposed solution provides computational advantages even with respect to state of the art approximations that rely on sparse kernel matrices.",19 "A quantitative entropy study of language complexity. We study the entropy of Chinese and English texts, based on characters in the case of Chinese texts and based on words for both languages. Significant differences are found between the languages and between different personal styles of debating partners. The entropy analysis points in the direction that the lower the entropy, the higher the complexity. The text analysis could be applied to individuals of different styles, to a single individual at different ages, as well as to different groups of the population.",4 "Multi-objective design of quantum circuits using genetic programming. Quantum computing is a new way of data processing based on the concepts of quantum mechanics. Quantum circuit design is the process of converting a quantum gate into a series of basic gates, and is divided into two general categories based on decomposition and composition. In the second group, using evolutionary algorithms and especially genetic algorithms, multiplication of the matrices of gates is used to achieve the final characteristic of the quantum circuit. Genetic programming is a subfield of evolutionary computing in which computer programs evolve to solve the studied problems. In past research done in the field of quantum circuit design, only one of the cost metrics (usually quantum cost) was investigated. In this paper, for the first time, a multi-objective approach is provided for the design of quantum circuits using genetic programming, which considers the depth and nearest neighbor cost metrics in addition to the quantum cost metric. Another innovation of this article is the use of a two-step fitness function and taking into account the equivalence of the global phase of quantum gates. The results show that the proposed method is able to find a good answer in a short time.",4 "Implementing a test strategy for an advanced video acquisition and processing architecture.
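The entropy measure used in the language-complexity abstract above is the standard Shannon entropy of the empirical character or word distribution; a minimal word-level sketch:

```python
from math import log2
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy in bits of the empirical distribution of tokens,
    which may be words or single characters."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

Calling it on `text.split()` gives the word-based estimate, and on `list(text)` the character-based one used for Chinese in the study.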
This paper presents aspects related to the test process of an advanced video system used for remote IP surveillance. The system is based on a Pentium compatible architecture using the industrial standard PC104+. First the overall architecture of the system is presented, involving both hardware and software aspects. The acquisition board, developed with a special, nonstandard architecture, is also briefly presented. The main purpose of this research was to set up a coherent set of procedures in order to test all aspects of the video acquisition board. To accomplish this, it was necessary to set up a procedure in two steps: a stand alone video board test (functional test) and an in-system test procedure verifying compatibility with the OS: Linux and Windows. The paper also presents the results obtained using this procedure.",4 "Tensorizing generative adversarial nets. Generative adversarial networks (GANs) and their variants demonstrate state-of-the-art performance in the class of generative models. To capture higher dimensional distributions, the common learning procedure requires high computational complexity and a large number of parameters. In this paper, we present a new generative adversarial framework by representing each layer as a tensor structure connected by multilinear operations, aiming to reduce the number of model parameters by a large factor while preserving the quality of generalized performance. To learn the model, we develop an efficient algorithm by alternating optimization of the mode connections. Experimental results demonstrate that our model can achieve a high compression rate for model parameters, up to 40 times compared to the existing GAN.",4 "Adam: a method for stochastic optimization. We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed.
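The moment-estimation update just described can be sketched in a few lines; the hyperparameter defaults follow the commonly published values, and the deterministic quadratic objective is purely an illustration, not part of the abstract:

```python
def adam(grad, x0, steps=4000, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Minimal scalar Adam: exponential moving averages m, v of the gradient
    and its square, with bias correction, applied to a deterministic gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # first moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g      # second moment (uncentered variance)
        m_hat = m / (1 - b1 ** t)          # bias-corrected estimates
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (v_hat ** 0.5 + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = adam(lambda x: 2 * (x - 3), x0=0.0)
```

The invariance to diagonal gradient rescaling mentioned in the abstract is visible here: scaling `g` by a constant scales `m_hat` and `v_hat ** 0.5` by the same factor, leaving the step unchanged.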
We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.",4 "Contradiction detection for rumorous claims. The utilization of social media material in journalistic workflows is increasing, demanding automated methods for the identification of mis- and disinformation. Since textual contradiction across social media posts can be a signal of rumorousness, we seek to model how claims in Twitter posts are being textually contradicted. We identify two different contexts in which contradiction emerges: its broader form can be observed across independently posted tweets and its more specific form in threaded conversations. We define the two scenarios in terms of the central elements of the argumentation: claims and conversation structure. We design and evaluate models for the two scenarios uniformly as 3-way recognizing textual entailment tasks, in order to represent claims and conversation structure implicitly in a generic inference model, whereas previous studies used explicit representations of these properties. To address noisy text, our classifiers use simple similarity features derived from the string and part-of-speech level. Corpus statistics reveal distribution differences for these features in contradictory as opposed to non-contradictory tweet relations, and the classifiers yield state of the art performance.",4 "Are crossing dependencies really scarce?. The syntactic structure of a sentence can be modelled as a tree, where vertices correspond to words and edges indicate syntactic dependencies. It has been claimed recurrently that the number of edge crossings in real sentences is small. However, a baseline or null hypothesis has been lacking. Here we quantify the amount of crossings of real sentences and compare it to the predictions of a series of baselines. We conclude that crossings are really scarce in real sentences. Their scarcity is unexpected given the hubiness of the trees. Indeed, real sentences are close to linear trees, where the potential number of crossings is maximized.",15 "Heuristic algorithms for obtaining polynomial threshold functions with low densities.
In this paper we present several heuristic algorithms, including a genetic algorithm (GA), for obtaining polynomial threshold function (PTF) representations of Boolean functions (BFs) with a small number of monomials. We compare them among each other and with the algorithm of Oztop via computational experiments. The results indicate that our heuristic algorithms find more parsimonious representations compared to the non-heuristic and GA-based algorithms.",4 "Comparison of multi-task convolutional neural network (MT-CNN) and other methods for toxicity prediction. Toxicity analysis and prediction are of paramount importance to human health and environmental protection. Existing computational methods are built from a wide variety of descriptors and regressors, which makes their performance analysis difficult. For example, deep neural network (DNN), a successful approach on many occasions, acts like a black box and offers little conceptual elegance or physical understanding. The present work constructs a common set of microscopic descriptors based on established physical models for charges, surface areas and free energies to assess the performance of multi-task convolutional neural network (MT-CNN) architectures and other approaches, including random forest (RF) and gradient boosting decision tree (GBDT), on an equal footing. Comparison is also given to convolutional neural network (CNN) and non-convolutional deep neural network (DNN) algorithms. Four benchmark toxicity data sets (i.e., endpoints) are used to evaluate the various approaches. Extensive numerical studies indicate that the present MT-CNN architecture is able to outperform the state-of-the-art methods.",16 "Coverless information hiding based on a generative model. A new coverless image information hiding method based on a generative model is proposed. We feed the secret image into a generative model database to generate a meaning-normal and independent image that is different from the secret image; then, the generated image is transmitted to the receiver and fed into the generative model database to generate another image that is visually the same as the secret image. We only need to transmit a meaning-normal image that is not related to the secret image, and thus achieve the same effect as the transmission of the secret image.
For the first time, we propose a coverless image information hiding method based on a generative model. Compared with traditional image steganography, the transmitted image does not embed any information of the secret image in this method, and therefore it can effectively resist steganalysis tools. Experimental results show that our method has high capacity, safety and reliability.",4 "The complexity of normal form rewrite sequences for associativity. The complexity of a particular term-rewrite system is considered: the rule of associativity (x*y)*z --> x*(y*z). Algorithms and exact calculations are given for the longest and shortest sequences of applications of --> that result in a normal form (NF). The shortest NF sequence for a term x is always of length n-drm(x), where n is the number of occurrences of * in x and drm(x) is the depth of the rightmost leaf of x. The longest NF sequence for any term is of length n(n-1)/2.",2 "Multimodal recurrent neural networks with information transfer layers for indoor scene labeling. This paper proposes a new method called multimodal RNNs for RGB-D scene semantic segmentation. It is optimized to classify image pixels given two input sources: RGB color channels and depth maps. It simultaneously performs training of two recurrent neural networks (RNNs) that are crossly connected through information transfer layers, which are learnt to adaptively extract relevant cross-modality features. Each RNN model learns its representations from its own previous hidden states and the transferred patterns from the other RNN's previous hidden states; thus, both model-specific and cross-modality features are retained. We exploit the structure of quad-directional 2D-RNNs to model the short and long range contextual information in the 2D input image. We carefully designed various baselines to efficiently examine our proposed model structure. We test our multimodal RNNs method on popular RGB-D benchmarks and show that it outperforms previous methods significantly and achieves competitive results with the state-of-the-art works.",4 "Computing web-scale topic models using an asynchronous parameter server. Topic models such as latent Dirichlet allocation (LDA) have been widely used in information retrieval tasks ranging from smoothing and feedback methods to tools for exploratory search and discovery. However, classical methods for inferring topic models do not scale up to the massive size of today's publicly available web-scale data sets.
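The two quantities in the associativity abstract above are easy to check by brute force on small terms. In the sketch below a term is encoded as nested pairs (our own assumption): the shortest rewrite sequence to normal form has length n - drm(x), and the longest is n(n-1)/2 for a left comb:

```python
from collections import deque

def rewrites(t):
    """Yield every term reachable by one application of (x*y)*z -> x*(y*z)."""
    if isinstance(t, tuple):
        l, r = t
        if isinstance(l, tuple):
            x, y = l
            yield (x, (y, r))              # rule applied at the root
        for l2 in rewrites(l):
            yield (l2, r)                  # rule applied inside the left subtree
        for r2 in rewrites(r):
            yield (l, r2)                  # rule applied inside the right subtree

def n_ops(t):
    """Number of occurrences of * (internal nodes)."""
    return 1 + n_ops(t[0]) + n_ops(t[1]) if isinstance(t, tuple) else 0

def drm(t):
    """Depth of the rightmost leaf."""
    return 1 + drm(t[1]) if isinstance(t, tuple) else 0

def shortest_nf(t):
    """Length of the shortest rewrite sequence to normal form (BFS)."""
    seen, queue = {t}, deque([(t, 0)])
    while queue:
        u, d = queue.popleft()
        succs = list(rewrites(u))
        if not succs:                      # no redex left: normal form reached
            return d
        for s in succs:
            if s not in seen:
                seen.add(s)
                queue.append((s, d + 1))

def longest_nf(t):
    """Length of the longest rewrite sequence (the system is terminating)."""
    succs = list(rewrites(t))
    return 1 + max(map(longest_nf, succs)) if succs else 0

left_comb = ((("a", "b"), "c"), "d")       # ((a*b)*c)*d, so n = 3
```

For the left comb with n = 3, the formulas give a shortest sequence of 3 - 1 = 2 steps and a longest of 3*2/2 = 3 steps, which the brute-force search confirms.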
State-of-the-art approaches rely on custom strategies, implementations and hardware to facilitate their asynchronous, communication-intensive workloads. We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server. Advantages of this integration include convenient usage of existing data processing pipelines and eliminating the need for disk writes as data can be kept in memory from start to finish. Our goal is not to outperform highly customized implementations, but to propose a general high-performance topic modeling framework that can easily be used in today's data processing pipelines. We compare APS-LDA to existing Spark LDA implementations and show that our system can, on a 480-core cluster, process up to 135 times more data and 10 times more topics without sacrificing model quality.",4 "Evaluation of output embeddings for fine-grained image classification. Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve the results.",4 "Disentangling 3D pose in a dendritic CNN for unconstrained 2D face alignment. Heatmap regression has been used for landmark localization for quite a while now.
Most methods use a deep stack of bottleneck modules for the heatmap classification stage, followed by heatmap regression to extract the keypoints. In this paper, we present a single dendritic CNN, termed the Pose Conditioned Dendritic Convolution Neural Network (PCD-CNN), where a classification network is followed by a second and modular classification network, trained in an end to end fashion to obtain accurate landmark points. Following a Bayesian formulation, we disentangle the 3D pose of a face image explicitly by conditioning the landmark estimation on pose, making it different from multi-tasking approaches. Extensive experimentation shows that conditioning on pose reduces the localization error by making the network agnostic to face pose. The proposed model can be extended to yield a variable number of landmark points and hence broadens its applicability to other datasets. Instead of increasing the depth or width of the network, we train the CNN efficiently with a mask-softmax loss and hard sample mining to achieve up to a $15\%$ reduction in error compared to state-of-the-art methods for extreme and medium pose face images on challenging datasets including AFLW, AFW, COFW and IBUG.",4 "LangPro: a natural language theorem prover. LangPro is an automated theorem prover for natural language (https://github.com/kovvalsky/langpro). Given a set of premises and a hypothesis, it is able to prove semantic relations between them. The prover is based on a version of the analytic tableau method specially designed for natural logic. The proof procedure operates on logical forms that preserve linguistic expressions to a large extent. This property makes the logical forms easily obtainable from syntactic trees, in particular, combinatory categorial grammar derivation trees. The nature of the proofs is deductive and transparent. On the FraCaS and SICK textual entailment datasets, the prover achieves high results comparable to the state-of-the-art.",4 "Deterministic MDPs with adversarial rewards and bandit feedback. We consider a Markov decision process with deterministic state transition dynamics, adversarially generated rewards that change arbitrarily from round to round, and a bandit feedback model in which the decision maker only observes the rewards it receives. In this setting, we present a novel and efficient online decision making algorithm named MarcoPolo.
Under mild assumptions on the structure of the transition dynamics, we prove that MarcoPolo enjoys a regret of O(T^(3/4)sqrt(log(T))) against the best deterministic policy in hindsight. Specifically, our analysis does not rely on the stringent unichain assumption, which dominates much of the previous work on this topic.",4 "Trace norm regularization: faster inference for embedded speech recognition RNNs. We propose and evaluate new techniques for compressing and speeding up dense matrix multiplications as found in the fully connected and recurrent layers of neural networks for embedded large vocabulary continuous speech recognition (LVCSR). For compression, we introduce and study a trace norm regularization technique for training low rank factored versions of matrix multiplications. Compared to standard low rank training, we show that our method leads to good accuracy versus number of parameter trade-offs and can be used to speed up training of large models. For speedup, we enable faster inference on ARM processors through new open sourced kernels optimized for small batch sizes, resulting in 3x to 7x speed ups over the widely used gemmlowp library. Beyond LVCSR, we expect our techniques and kernels to be more generally applicable to embedded neural networks with large fully connected or recurrent layers.",4 "Exploring speech enhancement with generative adversarial networks for robust speech recognition. We investigate the effectiveness of generative adversarial networks (GANs) for speech enhancement, in the context of improving noise robustness of automatic speech recognition (ASR) systems. Prior work demonstrates that GANs can effectively suppress additive noise in raw waveform speech signals, improving perceptual quality metrics; however this technique was not justified in the context of ASR. In this work, we conduct a detailed study to measure the effectiveness of GANs in enhancing speech contaminated by both additive and reverberant noise. Motivated by recent advances in image processing, we propose operating GANs on log-Mel filterbank spectra instead of waveforms, which requires less computation and is more robust to reverberant noise. While GAN enhancement improves the performance of a clean-trained ASR system on noisy speech, it falls short of the performance achieved by conventional multi-style training (MTR).
By appending the GAN-enhanced features to the noisy inputs and retraining, we achieve a 7% WER improvement relative to the MTR system.",4 "Brain EEG time series selection: a novel graph-based approach for classification. Brain electroencephalography (EEG) classification has been widely applied to analyze cerebral diseases in recent years. Unfortunately, invalid or noisy EEGs degrade the diagnosis performance, and most previously developed methods ignore the necessity of EEG selection for classification. To this end, this paper proposes a novel maximum weight clique-based EEG selection approach, named mwcEEGs, which maps EEG selection to searching for maximum similarity-weighted cliques in an improved Fr\'{e}chet distance-weighted undirected EEG graph, simultaneously considering edge weights and vertex weights. mwcEEGs improves the classification performance by selecting intra-clique pairwise similar and inter-clique discriminative EEGs with a similarity threshold $\delta$. Experimental results demonstrate the effectiveness of the algorithm compared with state-of-the-art time series selection algorithms on real-world EEG datasets.",4 "Sequential dual deep learning with shape and texture features for sketch recognition. Recognizing freehand sketches with high arbitrariness is greatly challenging. Most existing methods either ignore the geometric characteristics or treat sketches as handwritten characters with fixed structural ordering. Consequently, they can hardly yield high recognition performance even though sophisticated learning techniques are employed. In this paper, we propose a sequential deep learning strategy that combines both shape and texture features. A coded shape descriptor is exploited to characterize the geometry of sketch strokes with high flexibility, while the outputs of convolutional neural networks (CNN) are taken as the abstract texture feature. We develop dual deep networks with memorable gated recurrent units (GRUs), and sequentially feed the two types of features into the dual networks, respectively. These dual networks enable feature fusion by another gated recurrent unit (GRU), and thus accurately recognize sketches invariant to stroke ordering.
Experiments on the TU-Berlin data set show that our method outperforms the average of human beings and state-of-the-art algorithms even when significant shape and appearance variations occur.",4 "Prediction with advice of unknown number of experts. In the framework of prediction with expert advice, we consider a recently introduced kind of regret bounds: the bounds that depend on the effective instead of the nominal number of experts. In contrast to the NormalHedge bound, which mainly depends on the effective number of experts but also weakly depends on the nominal one, we obtain a bound that does not contain the nominal number of experts at all. We use the defensive forecasting method and introduce an application of defensive forecasting to multivalued supermartingales.",4 "HNP3: a hierarchical nonparametric point process for modeling content diffusion over social media. This paper introduces a novel framework for modeling temporal events with complex longitudinal dependency that are generated by dependent sources. The framework takes advantage of multidimensional point processes for modeling the time of events. The intensity function of the proposed process is a mixture of intensities, and its complexity grows with the complexity of the temporal patterns of the data. Moreover, it utilizes a hierarchical dependent nonparametric approach to model the marks of events. These capabilities allow the proposed model to adapt its temporal and topical complexity according to the complexity of the data, which makes it a suitable candidate for real world scenarios. An online inference algorithm is also proposed that makes the framework applicable to a vast range of applications. The framework is applied to a real world application, modeling the diffusion of contents over networks. Extensive experiments reveal the effectiveness of the proposed framework in comparison with state-of-the-art methods.",19 "Nonlinear information bottleneck. The information bottleneck (IB) is a technique for extracting information in one 'input' random variable that is relevant for predicting a different 'output' random variable. IB works by encoding the input in a compressed 'bottleneck variable' from which the output can be accurately decoded. IB can be difficult to compute in practice, and has mainly been developed for two limited cases: (1) discrete random variables with small state spaces, and (2) continuous random variables that are jointly Gaussian distributed (in which case the encoding and decoding maps are linear).
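For context on the expert-advice abstract above: the classical exponential-weights (Hedge) forecaster is the baseline whose regret such bounds refine; the sketch below is that classical baseline, not the defensive-forecasting method of the paper, and the learning rate is an assumed value:

```python
from math import exp

def hedge(losses, eta=0.5):
    """Exponential-weights forecaster: maintain one weight per expert and
    multiply it by exp(-eta * loss) after each round; return the final
    normalized weight vector over experts."""
    n = len(losses[0])                     # number of experts
    w = [1.0] * n
    for round_losses in losses:
        w = [wi * exp(-eta * li) for wi, li in zip(w, round_losses)]
    total = sum(w)
    return [wi / total for wi in w]

# Expert 0 always suffers loss 0, expert 1 always suffers loss 1:
weights = hedge([[0.0, 1.0]] * 20)
```

After 20 rounds the weight mass concentrates on the better expert, which is the mechanism behind regret bounds in terms of the (effective) number of experts.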
Here we propose a method to perform IB in more general domains. Our approach can be applied to discrete or continuous inputs and outputs, and allows for nonlinear encoding and decoding maps. Our method uses a novel upper bound on the IB objective, derived using a non-parametric estimator of mutual information and a variational approximation. We show how to implement the method using neural networks and gradient-based optimization, and demonstrate its performance on the MNIST dataset.",4 "sk_p: a neural program corrector for MOOCs. We present a novel technique for automatic program correction in MOOCs, capable of fixing both syntactic and semantic errors without manual, problem specific correction strategies. Given an incorrect student program, it generates candidate programs from a distribution of likely corrections, and checks each candidate for correctness against a test suite. The key observation is that in MOOCs many programs share similar code fragments, and the seq2seq neural network model, used in the natural-language processing task of machine translation, can be modified and trained to recover these fragments. Experiments show our scheme can correct 29% of all incorrect submissions and out-performs state of the art approaches which require manual, problem specific correction strategies.",4 "Towards optimal learning of chain graphs. In this paper, we extend Meek's conjecture (Meek 1997) from directed acyclic graphs to chain graphs, and prove that the extended conjecture is true. Specifically, we prove that if a chain graph H is an independence map of the independence model induced by another chain graph G, then (i) G can be transformed into H by a sequence of directed and undirected edge additions and feasible splits and mergings, and (ii) after each operation in the sequence H remains an independence map of the independence model induced by G. This result has an important consequence for learning chain graphs from data, like the proof of Meek's conjecture in (Chickering 2002) had for learning Bayesian networks from data: it makes it possible to develop efficient and asymptotically correct learning algorithms under mild assumptions.",19 "Disjunctive logic programs with inheritance. This paper proposes a new knowledge representation language, called DLP<, which extends disjunctive logic programming (with strong negation) with inheritance.
The addition of inheritance enhances the knowledge modeling features of the language, providing a natural representation of default reasoning with exceptions. A declarative model-theoretic semantics of DLP< is provided, which is shown to generalize the answer set semantics of disjunctive logic programs. The knowledge modeling features of the language are illustrated by encoding classical nonmonotonic problems in DLP<. The complexity of DLP< is analyzed, proving that inheritance does not cause any computational overhead, as reasoning in DLP< has exactly the same complexity as reasoning in disjunctive logic programming. This is confirmed by the existence of an efficient translation from DLP< to plain disjunctive logic programming. Using this translation, an advanced KR system supporting the DLP< language has been implemented on top of the DLV system and subsequently integrated into DLV.",4 "Recurrent neural network encoder with attention for community question answering. We apply a general recurrent neural network (RNN) encoder framework to community question answering (cQA) tasks. Our approach does not rely on any linguistic processing, and can be applied to different languages or domains. Further improvements are observed when we extend the RNN encoders with a neural attention mechanism that encourages reasoning over entire sequences. To deal with practical issues such as data sparsity and imbalanced labels, we apply various techniques such as transfer learning and multitask learning. Our experiments on the SemEval-2016 cQA task show a 10% improvement in MAP score compared to an information retrieval-based approach, and achieve comparable performance to a strong handcrafted feature-based method.",4 "DNA reservoir computing: a novel molecular computing approach. We propose a novel molecular computing approach based on reservoir computing. In reservoir computing, a dynamical core, called a reservoir, is perturbed with an external input signal while a readout layer maps the reservoir dynamics to a target output. Computation takes place as a transformation from the input space to a high-dimensional spatiotemporal feature space created by the transient dynamics of the reservoir. The readout layer then combines these features to produce the target output. We show that coupled deoxyribozyme oscillators can act as the reservoir.
show despite using three coupled oscillators, molecular reservoir computer could achieve 90% accuracy benchmark temporal problem.",4 "differentiable transition additive multiplicative neurons. existing approaches combine additive multiplicative neural units either use fixed assignment operations require discrete optimization determine function neuron perform. however, leads extensive increase computational complexity training procedure. present novel, parameterizable transfer function based mathematical concept non-integer functional iteration allows operation neuron performs smoothly and, importantly, differentiably adjusted addition multiplication. allows decision addition multiplication integrated standard backpropagation training procedure.",4 "foundations brussels operational-realistic approach cognition. scientific community becoming interested research applies mathematical formalism quantum theory model human decision-making. paper, provide theoretical foundations quantum approach cognition developed brussels. foundations rest results two decades studies axiomatic operational-realistic approaches foundations quantum physics. deep analogies foundations physics cognition lead us investigate validity quantum theory general unitary framework cognitive processes, empirical success hilbert space models derived investigation provides strong theoretical confirmation validity. however, two situations cognitive realm, 'question order effects' 'response replicability', indicate even hilbert space framework could insufficient reproduce collected data. mean mentioned operational-realistic approach would incorrect, simply larger class measurements would force human cognition, extended quantum formalism may needed deal them.
explain, recently derived 'extended bloch representation' quantum theory (and associated 'general tension-reduction' model) precisely provides extended formalism, remaining within unitary interpretative framework.",4 "adjuncts processing lexical rules. standard hpsg analysis germanic verb clusters explain observed narrow-scope readings adjuncts verb clusters. present extension hpsg analysis accounts systematic ambiguity scope adjuncts verb cluster constructions, treating adjuncts members subcat list. extension uses powerful recursive lexical rules, implemented complex constraints. show `delayed evaluation' techniques constraint-logic programming used process lexical rules.",2 "using quaternion's representation individuals swarm intelligence evolutionary computation. paper introduces novel idea representation individuals using quaternions swarm intelligence evolutionary algorithms. quaternions number system, extends complex numbers. successfully applied problems theoretical physics areas needing fast rotation calculations. propose application quaternions optimization, precisely, using quaternions representation individuals bat algorithm. preliminary results experiments optimizing test-suite consisting ten standard functions showed new algorithm significantly improved results original bat algorithm. moreover, obtained results comparable swarm intelligence evolutionary algorithms, like artificial bees colony, differential evolution. believe representation could also successfully applied swarm intelligence evolutionary algorithms.",4 "learning observe. process diagnosis involves learning state system various observations symptoms findings system. sophisticated bayesian (and other) algorithms developed revise maintain beliefs system observations made. nonetheless, diagnostic models tended ignore common sense reasoning exploited human diagnosticians; particular, one learn observations made, spirit conversational implicature. 
two concepts describe extract information observations made. first, symptoms, present, likely reported others. second, human diagnosticians expert systems economical data-gathering, searching first likely find symptoms present. thus, desirable bias toward reporting symptoms present. develop simple model concepts significantly improve diagnostic inference.",4 "skynet: efficient robust neural network training tool machine learning astronomy. present first public release generic neural network training algorithm, called skynet. efficient robust machine learning tool able train large deep feed-forward neural networks, including autoencoders, use wide range supervised unsupervised learning applications, regression, classification, density estimation, clustering dimensionality reduction. skynet uses `pre-training' method obtain set network parameters empirically shown close good solution, followed optimisation using regularised variant newton's method, level regularisation determined adjusted automatically; latter uses second-order derivative information improve convergence, without need evaluate store full hessian matrix, using fast approximate method calculate hessian-vector products. combination methods allows training complicated networks difficult optimise using standard backpropagation techniques. skynet employs convergence criteria naturally prevent overfitting, also includes fast algorithm estimating accuracy network outputs. utility flexibility skynet demonstrated application number toy problems, astronomical problems focusing recovery structure blurred noisy images, identification gamma-ray bursters, compression denoising galaxy images. skynet software, implemented standard ansi c fully parallelised using mpi, available http://www.mrao.cam.ac.uk/software/skynet/.",1 "item2vec: neural item embedding collaborative filtering. many collaborative filtering (cf) algorithms item-based sense analyze item-item relations order produce item similarities. 
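The item2vec recipe introduced above adapts skip-gram with negative sampling by treating each user's item set as an unordered "sentence", so every pair of items in the set co-occurs. A minimal sketch of that pair-generation step (the function name and toy baskets are illustrative, not from the paper):

```python
from itertools import permutations

def item2vec_pairs(baskets):
    """Generate SGNS (target, context) training pairs from item sets.

    Unlike word2vec's sliding window, item2vec uses the whole set as
    one context, so every ordered pair of distinct items co-occurs.
    """
    pairs = []
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate repeated items
        pairs.extend(permutations(items, 2))
    return pairs

# two users' item histories
baskets = [["guitar", "amp", "cable"], ["amp", "cable"]]
pairs = item2vec_pairs(baskets)
```

The resulting pairs would then feed a standard SGNS training loop; items frequently bought together end up with nearby embeddings.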
recently, several works field natural language processing (nlp) suggested learn latent representation words using neural embedding algorithms. among them, skip-gram negative sampling (sgns), also known word2vec, shown provide state-of-the-art results various linguistics tasks. paper, show item-based cf cast framework neural word embedding. inspired sgns, describe method name item2vec item-based cf produces embedding items latent space. method capable inferring item-item relations even user information available. present experimental results demonstrate effectiveness item2vec method show competitive svd.",4 "assigning satisfaction values constraints: algorithm solve dynamic meta-constraints. model dynamic meta-constraints special activity constraints activate constraints. also meta-constraints range constraints. algorithm presented constraints assigned one five different satisfaction values, leads assignment domain values variables csp. outline model algorithm presented, followed initial results two problems: simple classic csp car configuration problem. algorithm shown perform backtracks per solution, overheads form historical records required implementation state.",4 alternative restart strategies cma-es. paper focuses restart strategy cma-es multi-modal functions. first alternative strategy proceeds decreasing initial step-size mutation doubling population size restart. second strategy adaptively allocates computational budget among restart settings bipop scheme. restart strategies validated bbob benchmark; generality also demonstrated independent real-world problem suite related spacecraft trajectory optimization.,4 "learning phrase representations using rnn encoder-decoder statistical machine translation. paper, propose novel neural network model called rnn encoder-decoder consists two recurrent neural networks (rnn). one rnn encodes sequence symbols fixed-length vector representation, decodes representation another sequence symbols. 
encoder decoder proposed model jointly trained maximize conditional probability target sequence given source sequence. performance statistical machine translation system empirically found improve using conditional probabilities phrase pairs computed rnn encoder-decoder additional feature existing log-linear model. qualitatively, show proposed model learns semantically syntactically meaningful representation linguistic phrases.",4 "deepmind control suite. deepmind control suite set continuous control tasks standardised structure interpretable rewards, intended serve performance benchmarks reinforcement learning agents. tasks written python powered mujoco physics engine, making easy use modify. include benchmarks several learning algorithms. control suite publicly available https://www.github.com/deepmind/dm_control . video summary tasks available http://youtu.be/raai4qzcybs .",4 "abstract machine typed feature structures. paper describes first step towards definition abstract machine linguistic formalisms based typed feature structures, hpsg. core design abstract machine given detail, including compilation process high-level specification language abstract machine language implementation abstract instructions. thus apply methods proved useful computer science study natural languages: grammar specified using formalism endowed operational semantics. currently, machine supports unification simple feature structures, unification sequences structures, cyclic structures disjunction.",2 "study clear sky models singapore. estimation total solar irradiance falling earth's surface important field solar energy generation forecasting. several clear-sky solar radiation models developed last decades. models based empirical distribution various geographical parameters; models consider various atmospheric effects solar energy estimation. paper, perform comparative analysis several popular clear-sky models, tropical region singapore. 
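One of the simplest empirical clear-sky models often included in such comparisons is the Haurwitz model, which estimates global horizontal irradiance from the solar zenith angle alone. A sketch (coefficients follow the standard Haurwitz formulation; this is an illustration, not necessarily one of the three models the paper compares):

```python
import math

def haurwitz_ghi(zenith_deg):
    """Haurwitz clear-sky global horizontal irradiance in W/m^2.

    GHI = 1098 * cos(z) * exp(-0.057 / cos(z)), clipped to zero
    when the sun is at or below the horizon.
    """
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0:
        return 0.0
    return 1098.0 * cos_z * math.exp(-0.057 / cos_z)

ghi_overhead = haurwitz_ghi(0.0)  # sun at zenith
```

Such a model could then be validated against the collocated weather-station measurements described in the abstract.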
important countries like singapore, primarily focused reliable efficient solar energy generation. analyze compare three popular clear-sky models widely used literature. validate solar estimation results using actual solar irradiance measurements obtained collocated weather stations. finally conclude reliable clear sky model singapore, based clear sky days year.",15 preference via entrenchment. introduce simple generalization gardenfors makinson's epistemic entrenchment called partial entrenchment. show preferential inference generated sceptical counterpart inference mechanism defined directly partial entrenchment.,4 "memory-based control recurrent neural networks. partially observed control problems challenging aspect reinforcement learning. extend two related, model-free algorithms continuous control -- deterministic policy gradient stochastic value gradient -- solve partially observed domains using recurrent neural networks trained backpropagation time. demonstrate approach, coupled long-short term memory able solve variety physical control problems exhibiting assortment memory requirements. include short-term integration information noisy sensors identification system parameters, well long-term memory problems require preserving information many time steps. also demonstrate success combined exploration memory problem form simplified version well-known morris water maze task. finally, show approach deal high-dimensional observations learning directly pixels. find recurrent deterministic stochastic policies able learn similarly good solutions tasks, including water maze agent must learn effective search strategies.",4 "linking search space structure, run-time dynamics, problem difficulty: step toward demystifying tabu search. tabu search one effective heuristics locating high-quality solutions diverse array np-hard combinatorial optimization problems. 
despite widespread success tabu search, researchers poor understanding many key theoretical aspects algorithm, including models high-level run-time dynamics identification search space features influence problem difficulty. consider questions context job-shop scheduling problem (jsp), domain tabu search algorithms shown remarkably effective. previously, demonstrated mean distance random local optima nearest optimal solution highly correlated problem difficulty well-known tabu search algorithm jsp introduced taillard. paper, discuss various shortcomings measure develop new model problem difficulty corrects deficiencies. show taillard's algorithm modeled high fidelity simple variant straightforward random walk. random walk model accounts nearly variability cost required locate optimal sub-optimal solutions random jsps, provides explanation differences difficulty random versus structured jsps. finally, discuss empirically substantiate two novel predictions regarding tabu search algorithm behavior. first, method constructing initial solution highly unlikely impact performance tabu search. second, tabu tenure selected small possible simultaneously avoiding search stagnation; values larger necessary lead significant degradations performance.",4 "large kernel matters -- improve semantic segmentation global convolutional network. one recent trends [30, 31, 14] network architecture design stacking small filters (e.g., 1x1 3x3) entire network stacked small filters efficient large kernel, given computational complexity. however, field semantic segmentation, need perform dense per-pixel prediction, find large kernel (and effective receptive field) plays important role perform classification localization tasks simultaneously. following design principle, propose global convolutional network address classification localization issues semantic segmentation. also suggest residual-based boundary refinement refine object boundaries.
approach achieves state-of-the-art performance two public benchmarks significantly outperforms previous results, 82.2% (vs 80.2%) pascal voc 2012 dataset 76.9% (vs 71.8%) cityscapes dataset.",4 "unsupervised feature learning audio analysis. identifying acoustic events continuously streaming audio source interest many applications including environmental monitoring basic research. scenario neither different event classes known distinguishes one class another. therefore, unsupervised feature learning method exploration audio data presented paper. incorporates two following novel contributions: first, audio frame predictor based convolutional lstm autoencoder demonstrated, used unsupervised feature extraction. second, training method autoencoders presented, leads distinct features amplifying event similarities. comparison standard approaches, features extracted audio frame predictor trained novel approach show 13% better results used classifier 36% better results used clustering.",4 "learning structural weight uncertainty sequential decision-making. learning probability distributions weights neural networks (nns) recently proven beneficial many applications. bayesian methods, stein variational gradient descent (svgd), offer elegant framework reason nn model uncertainty. however, assuming independent gaussian priors individual nn weights (as often applied), svgd impose prior knowledge often structural information (dependence) among weights. propose efficient posterior learning structural weight uncertainty, within svgd framework, employing matrix variate gaussian priors nn parameters. investigate learned structural uncertainty sequential decision-making problems, including contextual bandits reinforcement learning. experiments several synthetic real datasets indicate superiority model, compared state-of-the-art methods.",19 "cnn-based spatial feature fusion algorithm hyperspectral imagery classification.
shortage training samples remains one main obstacles applying artificial neural networks (ann) hyperspectral images classification. fuse spatial spectral information, pixel patches often utilized train model, may aggregate problem. existing works, ann model supervised center-loss (annc) introduced. training merely spectral information, annc yields discriminative spectral features suitable subsequent classification tasks. paper, cnn-based spatial feature fusion (csff) algorithm proposed, allows smart fusion spatial information spectral features extracted annc. critical part csff, cnn-based discriminant model introduced estimate whether two paring pixels belong class. testing stage, applying discriminant model pixel-pairs generated test pixel neighbors, local structure estimated represented customized convolutional kernel. spectral-spatial feature obtained convolutional operation estimated kernel corresponding spectral features within neighborhood. last, label test pixel predicted classifying resulting spectral-spatial feature. without increasing number training samples involving pixel patches training stage, csff framework achieves state-of-the-art declining $20\%-50\%$ classification failures experiments three well-known hyperspectral images.",4 "galileo: generalized low-entropy mixture model. present new method generating mixture models data categorical attributes. keys approach entropy-based density metric categorical space annealing high-entropy/low-density components initial state many components. pruning low-density components using entropy-based density allows galileo consistently find high-quality clusters optimal number clusters. galileo shown promising results range test datasets commonly used categorical clustering benchmarks. demonstrate scaling galileo linear number records dataset, making method suitable large categorical datasets.",19 "reconstructive sparse code transfer contour detection semantic labeling. 
frame task predicting semantic labeling sparse reconstruction procedure applies target-specific learned transfer function generic deep sparse code representation image. strategy partitions training two distinct stages. first, unsupervised manner, learn set generic dictionaries optimized sparse coding image patches. train multilayer representation via recursive sparse dictionary learning pooled codes output earlier layers. second, encode training images generic dictionaries learn transfer function optimizes reconstruction patches extracted annotated ground-truth given sparse codes corresponding image patches. test time, encode novel image using generic dictionaries reconstruct using transfer function. output reconstruction semantic labeling test image. applying strategy task contour detection, demonstrate performance competitive state-of-the-art systems. unlike almost prior work, approach obviates need form hand-designed features filters. illustrate general applicability, also show initial results semantic part labeling human faces. effectiveness approach opens new avenues research deep sparse representations. classifiers utilize representation novel manner. rather acting nodes deepest layer, attach nodes along slice multiple layers network order make predictions local patches. flexible combination generatively learned sparse representation discriminatively trained transfer classifiers extends notion sparse reconstruction encompass arbitrary semantic labeling tasks.",4 "unveiling link logical fallacies web persuasion. last decade human-computer interaction (hci) started focus attention forms persuasive interaction computer technologies goal changing users behavior attitudes according predefined direction. work, hypothesize strong connection logical fallacies (forms reasoning logically invalid cognitively effective) common persuasion strategies adopted within web technologies. 
aim empirically evaluating hypothesis, carried pilot study sample 150 e-commerce websites.",4 "contextual position-aware factorization machines sentiment classification. existing machine learning models achieved great success sentiment classification, typically explicitly capture sentiment-oriented word interaction, lead poor results fine-grained analysis snippet level (a phrase sentence). factorization machine provides possible approach learning element-wise interaction recommender systems, directly applicable task due inability model contexts word sequences. work, develop two position-aware factorization machines consider word interaction, context position information. information jointly encoded set sentiment-oriented word interaction vectors. compared traditional word embeddings, swi vectors explicitly capture sentiment-oriented word interaction simplify parameter learning. experimental results show comparable performance state-of-the-art methods document-level classification, benefit snippet/sentence-level sentiment analysis.",4 "mathematical programming strategies solving minimum common string partition problem. minimum common string partition problem np-hard combinatorial optimization problem applications computational biology. work propose first integer linear programming model solving problem. moreover, basis integer linear programming model develop deterministic 2-phase heuristic applicable larger problem instances. results show provenly optimal solutions obtained problem instances small medium size literature solving proposed integer linear programming model cplex. furthermore, new best-known solutions obtained considered problem instances literature. concerning heuristic, able show outperforms heuristic competitors related literature.",4 "2d geometric information really tell us 3d face shape?. face image contains geometric cues form configurational information contours used estimate 3d face shape. 
clear 3d reconstruction 2d points highly ambiguous constraints enforced, one might expect face-space constraint solves problem. show case geometric information ambiguous cue. two sources ambiguity. first that, within space 3d face shapes, flexibility modes remain parts face fixed. second occurs perspective projection result perspective transformation camera distance varies. two different faces, viewed different distances, give rise 2d geometry. demonstrate ambiguities, develop new algorithms fitting 3d morphable model 2d landmarks contours either orthographic perspective projection show compute flexibility modes cases. show fitting problems posed separable nonlinear least squares problem solved efficiently. provide quantitative qualitative evidence ambiguity exists synthetic data real images.",4 "robust subspace clustering via thresholding. problem clustering noisy incompletely observed high-dimensional data points union low-dimensional subspaces set outliers considered. number subspaces, dimensions, orientations assumed unknown. propose simple low-complexity subspace clustering algorithm, applies spectral clustering adjacency matrix obtained thresholding correlations data points. words, adjacency matrix constructed nearest neighbors data point spherical distance. statistical performance analysis shows algorithm exhibits robustness additive noise succeeds even subspaces intersect. specifically, results reveal explicit tradeoff affinity subspaces tolerable noise level. furthermore prove algorithm succeeds even data points incompletely observed number missing entries allowed (up log-factor) linear ambient dimension. also propose simple scheme provably detects outliers, present numerical results real synthetic data.",19 "artificial neoteny evolutionary image segmentation. neoteny, also spelled paedomorphosis, defined biological terms retention organism juvenile even larval traits later life. 
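The thresholding step at the heart of the subspace clustering abstract above — keeping, for each point, only its nearest neighbours in spherical (cosine) distance — can be sketched as follows (a pure-Python toy with points drawn from two 1-D subspaces of the plane; all names are illustrative):

```python
import math

def cosine_affinity(points):
    """|cos angle| between every pair of points (each point a tuple)."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))
    n = len(points)
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dot = sum(a * b for a, b in zip(points[i], points[j]))
                aff[i][j] = abs(dot) / (norm(points[i]) * norm(points[j]))
    return aff

def threshold_adjacency(points, q):
    """Keep, per point, only its q largest-affinity neighbours."""
    aff = cosine_affinity(points)
    adj = [[0.0] * len(points) for _ in points]
    for i, row in enumerate(aff):
        keep = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:q]
        for j in keep:
            adj[i][j] = row[j]
    return adj

# two noisy 1-D subspaces (lines) in R^2
pts = [(1, 0.02), (2, -0.01), (-1.5, 0.03),   # near the x-axis
       (0.01, 1), (-0.02, 2), (0.02, -1.2)]   # near the y-axis
adj = threshold_adjacency(pts, q=2)
```

Spectral clustering on a symmetrised version of `adj` would then recover the two subspaces; here the thresholding alone already disconnects them.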
species, morphological development retarded; organism juvenilized sexually mature. shifts reproductive capability would appear adaptive significance organisms exhibit it. terms evolutionary theory, process paedomorphosis suggests larval stages developmental phases existing organisms may give rise, certain circumstances, wholly new organisms. although present work pretend model simulate biological details concept way, ideas incorporated rather simple abstract computational strategy, order allow (if possible) faster convergence simple non-memetic genetic algorithms, i.e. without using local improvement procedures (e.g. via baldwin lamarckian learning). case-study, genetic algorithm used colour image segmentation purposes using k-mean unsupervised clustering methods, namely guiding evolutionary algorithm search finding optimal sub-optimal data partition. average results suggest use neotonic strategies employing juvenile genotypes later generations use linear-dynamic mutation rates instead constant, increase fitness values 58% comparing classical genetic algorithms, independently starting population characteristics search space. keywords: genetic algorithms, artificial neoteny, dynamic mutation rates, faster convergence, colour image segmentation, classification, clustering.",4 "automated surgical skill assessment rmis training. purpose: manual feedback basic rmis training consume significant amount time expert surgeons' schedule prone subjectivity. vr-based training tasks generate automated score reports, mechanism generating automated feedback surgeons performing basic surgical tasks rmis training. paper, explore usage different holistic features automated skill assessment using robot kinematic data propose weighted feature fusion technique improving score prediction performance. 
methods: perform experiments publicly available jigsaws dataset evaluate four different types holistic features robot kinematic data - sequential motion texture (smt), discrete fourier transform (dft), discrete cosine transform (dct) approximate entropy (apen). features used skill classification exact skill score prediction. along using features individually, also evaluate performance using proposed weighted combination technique. results: results demonstrate holistic features outperform previous hmm based state-of-the-art methods skill classification jigsaws dataset. also, proposed feature fusion strategy significantly improves performance skill score predictions achieving 0.61 average spearman correlation coefficient. conclusions: holistic features capturing global information robot kinematic data successfully used evaluating surgeon skill basic surgical tasks da vinci robot. using framework presented potentially allow real time score feedback rmis training.",4 "correlation game unsupervised learning yields computational interpretations hebbian excitation, anti-hebbian inhibition, synapse elimination. much learned plasticity biological synapses empirical studies. hebbian plasticity driven correlated activity presynaptic postsynaptic neurons. synapses converge onto neuron often behave compete fixed resource; survive competition others eliminated. provide computational interpretations aspects synaptic plasticity, formulate unsupervised learning zero-sum game hebbian excitation anti-hebbian inhibition neural network model. game formalizes intuition hebbian excitation tries maximize correlations neurons inputs, anti-hebbian inhibition tries decorrelate neurons other. include model synaptic competition, enables neuron eliminate connections except strongly correlated inputs. empirical studies, show facilitates learning sensory features resemble parts objects.",4 "multi-domain collaborative filtering. 
collaborative filtering effective recommendation approach preference user item predicted based preferences users similar interests. big challenge using collaborative filtering methods data sparsity problem often arises user typically rates items hence rating matrix extremely sparse. paper, address problem considering multiple collaborative filtering tasks different domains simultaneously exploiting relationships domains. refer multi-domain collaborative filtering (mcf) problem. solve mcf problem, propose probabilistic framework uses probabilistic matrix factorization model rating problem domain allows knowledge adaptively transferred across different domains automatically learning correlation domains. also introduce link function different domains correct biases. experiments conducted several real-world applications demonstrate effectiveness methods compared representative methods.",4 "(1+$λ$) evolutionary algorithm self-adjusting mutation rate. propose new way self-adjust mutation rate population-based evolutionary algorithms discrete search spaces. roughly speaking, consists creating half offspring mutation rate twice current mutation rate half half current rate. mutation rate updated rate used subpopulation contains best offspring. analyze $(1+\lambda)$ evolutionary algorithm self-adjusting mutation rate optimizes onemax test function. prove dynamic version $(1+\lambda)$~ea finds optimum expected optimization time (number fitness evaluations) $o(n\lambda/\!\log\lambda+n\log n)$. time asymptotically smaller optimization time classic $(1+\lambda)$ ea. previous work shows performance best-possible among $\lambda$-parallel mutation-based unbiased black-box algorithms. result shows new way adjusting mutation rate find optimal dynamic parameter values fly. since adjustment mechanism simpler ones previously used adjusting mutation rate parameters itself, optimistic find applications.",4 "characterizing concept drift. 
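The two-rate scheme of the self-adjusting (1+λ) EA described above — half the offspring mutate at rate 2r, half at r/2, and r is updated to the rate of the subpopulation containing the best offspring — can be sketched on OneMax as follows (the capping of r to [2/n, 1/4] is an assumption of this sketch, as are all names):

```python
import random

def onemax(x):
    return sum(x)

def self_adjusting_opl_ea(n, lam, max_evals, seed=0):
    """(1+lambda) EA with the two-rate self-adjusting mutation rate.

    Half the offspring flip each bit with probability 2r, half with
    r/2; r becomes the rate that produced the best offspring.
    """
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    fit = onemax(parent)
    r = 2.0 / n
    evals = 0
    while fit < n and evals < max_evals:
        best_child, best_fit, best_rate = None, -1, r
        for k in range(lam):
            rate = 2 * r if k < lam // 2 else r / 2
            child = [1 - b if rng.random() < rate else b for b in parent]
            evals += 1
            f = onemax(child)
            if f > best_fit:
                best_child, best_fit, best_rate = child, f, rate
        if best_fit >= fit:  # elitist (1+lambda) selection
            parent, fit = best_child, best_fit
        r = min(max(best_rate, 2.0 / n), 0.25)  # assumed capping
    return parent, fit

best, best_fit = self_adjusting_opl_ea(n=20, lam=4, max_evals=20000, seed=1)
```

On OneMax the rate is driven down toward 2/n near the optimum, mirroring the theoretically optimal schedule the abstract refers to.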
machine learning models static, world dynamic, increasing online deployment learned models gives increasing urgency development efficient effective mechanisms address learning context non-stationary distributions, commonly called concept drift. however, key issue characterizing different types drift occur previously subjected rigorous definition analysis. particular, qualitative drift categorizations proposed, formally defined, quantitative descriptions required precise objective understanding learner performance existed. present first comprehensive framework quantitative analysis drift. supports development first comprehensive set formal definitions types concept drift. formal definitions clarify ambiguities identify gaps previous definitions, giving rise new comprehensive taxonomy concept drift types solid foundation research mechanisms detect address concept drift.",4 "one deep music representation rule all? : comparative analysis different representation learning strategies. inspired success deploying deep learning fields computer vision natural language processing, learning paradigm also found way field music information retrieval. order benefit deep learning effective, also efficient manner, deep transfer learning become common approach. approach, possible reuse output pre-trained neural network basis new, yet unseen learning task. underlying hypothesis initial new learning tasks show commonalities applied type data (e.g. music audio), generated deep representation data also informative new task. since, however, networks used generate deep representations trained using single initial learning task, validity hypothesis questionable arbitrary new learning task. paper present results investigation best ways generate deep representations data learning tasks music domain. 
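A quantitative handle on drift of the kind the concept-drift framework above formalizes is a distance between the distributions observed in two time windows; one common choice (used here purely as an illustration, not the paper's specific definition) is the total variation distance over empirical class frequencies:

```python
from collections import Counter

def empirical_dist(window):
    """Empirical distribution of the labels in a window."""
    total = len(window)
    return {k: v / total for k, v in Counter(window).items()}

def total_variation(window_a, window_b):
    """Total variation distance between two windows' empirical
    distributions: 0 = identical, 1 = disjoint support."""
    p, q = empirical_dist(window_a), empirical_dist(window_b)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

drift = total_variation(['a', 'a', 'a', 'b'], ['a', 'b', 'b', 'b'])
```

Thresholding such a distance over a sliding window yields a simple drift detector; the framework's taxonomy then distinguishes, e.g., abrupt from gradual drift by how the distance evolves over time.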
conducted investigation via extensive empirical study involves multiple learning tasks, well multiple deep learning architectures varying levels information sharing tasks, order learn music representations. validate representations considering multiple unseen learning tasks evaluation. results experiments yield several insights approach design methods learning widely deployable deep data representations music domain.",4 "stage 4 validation satellite image automatic mapper lightweight computer program earth observation level 2 product generation, part 2 validation. european space agency (esa) defines earth observation (eo) level 2 product multispectral (ms) image corrected geometric, atmospheric, adjacency topographic effects, stacked scene classification map (scm) whose legend includes quality layers cloud cloud-shadow. esa eo level 2 product ever systematically generated ground segment. contribute toward filling information gap eo big sensory data esa eo level 2 product, stage 4 validation (val) shelf satellite image automatic mapper (siam) lightweight computer program prior knowledge based ms color naming conducted independent means. time-series annual web enabled landsat data (weld) image composites conterminous u.s. (conus) selected input dataset. annual siam weld maps conus validated comparison u.s. national land cover data (nlcd) 2006 map. test reference maps share spatial resolution spatial extent, map legends must harmonized. sake readability paper split two. previous part 1 theory provided multidisciplinary background priori color naming. present part 2 validation presents discusses stage 4 val results collected test siam weld map time series reference nlcd map original protocol wall wall thematic map quality assessment without sampling, test reference map legends differ agreement part 1. 
conclusions siam-weld maps instantiate level 2 scm product whose legend fao land cover classification system (lccs) taxonomy dichotomous phase (dp) level 1 vegetation/nonvegetation, level 2 terrestrial/aquatic superior lccs level.",4 "theory deep learning iib: optimization properties sgd. theory iib characterize mix theory experiments optimization deep convolutional networks stochastic gradient descent. main new result paper theoretical experimental evidence following conjecture sgd: sgd concentrates probability -- like classical langevin equation -- large volume, ""flat"" minima, selecting flat minimizers high probability also global minimizers.",4 "niching archive-based gaussian estimation distribution algorithm via adaptive clustering. model-based evolutionary algorithm, estimation distribution algorithm (eda) possesses unique characteristics widely applied global optimization. however, traditional gaussian eda (geda) may suffer premature convergence high risk falling local optimum dealing multimodal problem. paper, first attempts improve performance geda utilizing historical solutions develops novel archive-based eda variant. use historical solutions enhances search efficiency eda large extent, also significantly reduces population size faster convergence could achieved. then, archive-based eda integrated novel adaptive clustering strategy solving multimodal optimization problems. taking advantage clustering strategy locating different promising areas powerful exploitation ability archive-based eda, resultant algorithm endowed strong capability finding multiple optima. verify efficiency proposed algorithm, tested set well-known niching benchmark problems compared several state-of-the-art niching algorithms. experimental results indicate proposed algorithm competitive.",4 "chemical reaction optimization set covering problem. set covering problem (scp) one representative combinatorial optimization problems, many practical applications.
paper investigates development algorithm solve scp employing chemical reaction optimization (cro), general-purpose metaheuristic. tested wide range benchmark instances scp. simulation results indicate algorithm gives outstanding performance compared heuristics metaheuristics solving scp.",4 "sinkhorn distances: lightspeed computation optimal transportation distances. optimal transportation distances fundamental family parameterized distances histograms. despite appealing theoretical properties, excellent performance retrieval tasks intuitive formulation, computation involves resolution linear program whose cost prohibitive whenever histograms' dimension exceeds hundreds. propose work new family optimal transportation distances look transportation problems maximum-entropy perspective. smooth classical optimal transportation problem entropic regularization term, show resulting optimum also distance computed sinkhorn-knopp's matrix scaling algorithm speed several orders magnitude faster transportation solvers. also report improved performance classical optimal transportation distances mnist benchmark problem.",19 "bayesian deep convolutional encoder-decoder networks surrogate modeling uncertainty quantification. interested development surrogate models uncertainty quantification propagation problems governed stochastic pdes using deep convolutional encoder-decoder network similar fashion approaches considered deep learning image-to-image regression tasks. since normal neural networks data intensive cannot provide predictive uncertainty, propose bayesian approach convolutional neural nets. recently introduced variational gradient descent algorithm based stein's method scaled deep convolutional networks perform approximate bayesian inference millions uncertain network parameters. 
approach achieves state art performance terms predictive accuracy uncertainty quantification comparison approaches bayesian neural networks well techniques include gaussian processes ensemble methods even training data size relatively small. evaluate performance approach, consider standard uncertainty quantification benchmark problems including flow heterogeneous media defined terms limited data-driven permeability realizations. performance surrogate model developed good even though underlying structure shared input (permeability) output (flow/pressure) fields often case image-to-image regression models used computer vision problems. studies performed underlying stochastic input dimensionality $4,225$ uncertainty quantification methods fail. uncertainty propagation tasks considered predictive output bayesian statistics compared obtained monte carlo estimates.",15 "bayesian approach discovering truth conflicting sources data integration. practical data integration systems, common data sources integrated provide conflicting information entity. consequently, major challenge data integration derive complete accurate integrated records diverse sometimes conflicting sources. term challenge truth finding problem. observe sources generally reliable others, therefore good model source quality key solving truth finding problem. work, propose probabilistic graphical model automatically infer true records source quality without supervision. contrast previous methods, principled approach leverages generative process two types errors (false positive false negative) modeling two different aspects source quality. doing, also first approach designed merge multi-valued attribute types. method scalable, due efficient sampling-based inference algorithm needs iterations practice enjoys linear time complexity, even faster incremental variant. 
experiments two real world datasets show new method outperforms existing state-of-the-art approaches truth finding problem.",4 "stimont: core ontology multimedia stimuli description. affective multimedia documents images, sounds videos elicit emotional responses exposed human subjects. stimuli stored affective multimedia databases successfully used wide variety research psychology neuroscience areas related attention emotion processing. although important affective multimedia databases numerous deficiencies impair applicability. problems, brought forward paper, result low recall precision multimedia stimuli retrieval makes creating emotion elicitation procedures difficult labor-intensive. address issues new core ontology stimont introduced. stimont written owl-dl formalism extends w3c emotionml format expressive formal representation affective concepts, high-level semantics, stimuli document metadata elicited physiology. advantages ontology description affective multimedia stimuli demonstrated document retrieval experiment compared contemporary keyword-based querying methods. also, software tool intelligent stimulus generator retrieval affective multimedia construction stimuli sequences presented.",4 "fever: large-scale dataset fact extraction verification. unlike tasks despite recent interest, research textual claim verification hindered lack large-scale manually annotated datasets. paper introduce new publicly available dataset verification textual sources, fever: fact extraction verification. consists 185,441 claims generated altering sentences extracted wikipedia subsequently verified without knowledge sentence derived from. claims classified supported, refuted notenoughinfo annotators achieving 0.6841 fleiss $\kappa$. first two classes, annotators also recorded sentence(s) forming necessary evidence judgment. characterize challenge dataset presented, develop pipeline approach using baseline state-of-the-art components compare suitably designed oracles. 
best accuracy achieve labeling claim accompanied correct evidence 31.87%, ignore evidence achieve 50.91%. thus believe fever challenging testbed help stimulate progress claim verification textual sources.",4 "paraconsistency word puzzles. word puzzles problem representations logic languages received considerable attention last decade (ponnuru et al. 2004; shapiro 2011; baral dzifcak 2012; schwitter 2013). special interest problem generating representations directly natural language (nl) controlled natural language (cnl). interesting variation problem, best knowledge, scarcely explored variation context, input information inconsistent. situations, existing encodings word puzzles produce inconsistent representations break down. paper, bring well-known type paraconsistent logics, called annotated predicate calculus (apc) (kifer lozinskii 1992), bear problem. introduce new kind non-monotonic semantics apc, called consistency preferred stable models argue makes apc suitable platform dealing inconsistency word puzzles and, generally, nl sentences. also devise number general principles help user choose among different representations nl sentences, might seem equivalent but, fact, behave differently inconsistent information taken account. principles incorporated existing cnl translators, attempto controlled english (ace) (fuchs et al. 2008) peng light (white schwitter 2009). finally, show apc consistency preferred stable model semantics equivalently embedded asp preferences stable models, use embedding implement version apc clingo (gebser et al. 2011) asprin add-on (brewka et al. 2015).",4 "advances hyperspectral image classification: earth monitoring statistical learning methods. hyperspectral images show similar statistical properties natural grayscale color photographic images. however, classification hyperspectral images challenging high dimensionality pixels small number labeled examples typically available learning. 
peculiarities lead particular signal processing problems, mainly characterized indetermination complex manifolds. framework statistical learning gained popularity last decade. new methods presented account spatial homogeneity images, include user's interaction via active learning, take advantage manifold structure semisupervised learning, extract encode invariances, adapt classifiers image representations unseen yet similar scenes. tutorial reviews main advances hyperspectral remote sensing image classification illustrative examples.",4 "distance-based confidence score neural network classifiers. reliable measurement confidence classifiers' predictions important many applications is, therefore, important part classifier design. yet, although deep learning received tremendous attention recent years, much progress made quantifying prediction confidence neural network classifiers. bayesian models offer mathematically grounded framework reason model uncertainty, usually come prohibitive computational costs. paper propose simple, scalable method achieve reliable confidence score, based data embedding derived penultimate layer network. investigate two ways achieve desirable embeddings, using either distance-based loss adversarial training. test benefits method used classification error prediction, weighting ensemble classifiers, novelty detection. tasks show significant improvement traditional, commonly used confidence scores.",4 "multi-path feedback recurrent neural network scene parsing. paper, consider scene parsing problem propose novel multi-path feedback recurrent neural network (mpf-rnn) parsing scene images. mpf-rnn enhance capability rnns modeling long-range context information multiple levels better distinguish pixels easy confuse. different feedforward cnns rnns single feedback, mpf-rnn propagates contextual features learned top layer \textit{multiple} weighted recurrent connections learn bottom features.
better training mpf-rnn, propose new strategy considers accumulative loss multiple recurrent steps improve performance mpf-rnn parsing small objects. two novel components, mpf-rnn achieved significant improvement strong baselines (vgg16 res101) five challenging scene parsing benchmarks, including traditional siftflow, barcelona, camvid, stanford background well recently released large-scale ade20k.",4 "generating high-quality query suggestion candidates task-based search. address task generating query suggestions task-based search. current state art relies heavily suggestions provided major search engine. paper, solve task without reliance search engines. specifically, focus first step two-stage pipeline approach, dedicated generation query suggestion candidates. present three methods generating candidate suggestions apply multiple information sources. using purpose-built test collection, find methods able generate high-quality suggestion candidates.",4 "cumulative distribution networks derivative-sum-product algorithm. introduce new type graphical model called ""cumulative distribution network"" (cdn), expresses joint cumulative distribution product local functions. local function viewed providing evidence possible orderings, rankings, variables. interestingly, find conditional independence properties cdns quite different graphical models. also describe messagepassing algorithm efficiently computes conditional cumulative distributions. due unique independence properties cdn, messages general one-to-one correspondence messages exchanged standard algorithms, belief propagation. demonstrate application cdns structured ranking learning using previously-studied multi-player gaming dataset.",4 "zoom out-and-in network recursive training object proposal. paper, propose zoom-out-and-in network generating object proposals. utilize different resolutions feature maps network detect object instances various sizes. 
specifically, divide anchor candidates three clusters based scale size place feature maps distinct strides detect small, medium large objects, respectively. deeper feature maps contain region-level semantics help shallow counterparts identify small objects. therefore design zoom-in sub-network increase resolution high level features via deconvolution operation. high-level features high resolution combined merged low-level features detect objects. furthermore, devise recursive training pipeline consecutively regress region proposals training stage order match iterative regression testing stage. demonstrate effectiveness proposed method ilsvrc det ms coco datasets, algorithm performs better state-of-the-arts various evaluation metrics. also increases average precision around 2% detection system.",4 "face transfer generative adversarial network. face transfer animates facial performances character target video source actor. traditional methods typically based face modeling. propose end-to-end face transfer method based generative adversarial network. specifically, leverage cyclegan generate face image target character corresponding head pose facial expression source. order improve quality generated videos, adopt patchgan explore effect different receptive field sizes generated images.",4 "generating extractive summaries scientific paradigms. researchers scientists increasingly find position quickly understand large amounts technical material. goal effectively serve need using bibliometric text mining summarization techniques generate summaries scientific literature. show use citations produce automatically generated, readily consumable, technical extractive summaries. first propose c-lexrank, model summarizing single scientific articles based citations, employs community detection extracts salient information-rich sentences. next, extend experiments summarize set papers, cover scientific topic. 
generate extractive summaries set question answering (qa) dependency parsing (dp) papers, abstracts, citation sentences show citations unique information amenable creating summary.",4 "regression-based image alignment general object categories. gradient-descent methods exhibited fast reliable performance image alignment facial domain, largely ignored broader vision community. require image function smooth (numerically) differentiable -- properties hold pixel-based representations obeying natural image statistics, general classes non-linear feature transforms. show transforms dense sift incorporated lucas kanade alignment framework predicting descent directions via regression. enables robust matching instances general object categories whilst maintaining desirable properties lucas kanade capacity handle high-dimensional warp parametrizations fast rate convergence. present alignment results number objects imagenet, extension method unsupervised joint alignment objects corpus images.",4 "algorithm selection combinatorial search problems: survey. algorithm selection problem concerned selecting best algorithm solve given problem case-by-case basis. become especially relevant last decade, researchers increasingly investigating identify suitable existing algorithm solving problem instead developing new algorithms. survey presents overview work focusing contributions made area combinatorial search problems, algorithm selection techniques achieved significant performance improvements. unify organise vast literature according criteria determine algorithm selection systems practice. comprehensive classification approaches identifies analyses different directions algorithm selection approached. paper contrasts compares different methods solving problem well ways using solutions. closes identifying directions current future research.",4 "see tree lines: shazoo algorithm -- full version --. 
predicting nodes given graph fascinating theoretical problem applications several domains. since graph sparsification via spanning trees retains enough information making task much easier, trees important special case problem. although known predict nodes unweighted tree nearly optimal way, weighted case fully satisfactory algorithm available yet. fill hole introduce efficient node predictor, shazoo, nearly optimal weighted tree. moreover, show shazoo viewed common nontrivial generalization previous approaches unweighted trees weighted lines. experiments real-world datasets confirm shazoo performs well fully exploits structure input tree, gets close (and sometimes better than) less scalable energy minimization methods.",4 "affine-gradient based local binary pattern descriptor texture classification. present novel affine-gradient based local binary pattern (aglbp) descriptor texture classification. hard describe complicated texture using single type information, local binary pattern (lbp), utilizes sign information difference pixel local neighbors. descriptor three characteristics: 1) order make full use information contained texture, affine-gradient, different euclidean-gradient invariant affine transformation incorporated aglbp. 2) improved method proposed rotation invariance, depends reference direction calculating respect local neighbors. 3) feature selection method, considering statistical frequency intraclass variance training dataset, also applied reduce dimensionality descriptors. experiments three standard texture datasets, outex12, outex10 kth-tips2, conducted evaluate performance aglbp. results show proposed descriptor gets better performance comparing state-of-the-art rotation texture descriptors texture classification.",4 "bayesian matrix completion via adaptive relaxed spectral regularization. bayesian matrix completion studied based low-rank matrix factorization formulation promising results.
however, little work done bayesian matrix completion based direct spectral regularization formulation. fill gap presenting novel bayesian matrix completion method based spectral regularization. order circumvent difficulties dealing orthonormality constraints singular vectors, derive new equivalent form relaxed constraints, leads us design adaptive version spectral regularization feasible bayesian inference. bayesian method requires parameter tuning infer number latent factors automatically. experiments synthetic real datasets demonstrate encouraging results rank recovery collaborative filtering, notably good results sparse matrices.",4 "hybrid approach english-hindi name entity transliteration. machine translation (mt) research indian languages still infancy. much work done proper transliteration name entities domain. paper address issue. used english-hindi language pair experiments used hybrid approach. first processed english words using rule based approach extracts individual phonemes words applied statistical approach converts english equivalent hindi phoneme turn corresponding hindi word. approach attained 83.40% accuracy.",4 "quick energy-efficient bayesian computing binocular disparity using stochastic digital signals. reconstruction tridimensional geometry visual scene using binocular disparity information important issue computer vision mobile robotics, formulated bayesian inference problem. however, computation full disparity distribution advanced bayesian model usually intractable problem, proves computationally challenging even simple model. paper, show probabilistic hardware using distributed memory alternate representation data stochastic bitstreams solve problem high performance energy efficiency. put forward way express discrete probability distributions using stochastic data representations perform bayesian fusion using representations, show approach applied disparity computation.
evaluate system using simulated stochastic implementation discuss possible hardware implementations architectures potential sensorimotor processing robotics.",4 "automated map reading: image based localisation 2-d maps using binary semantic descriptors. describe novel approach image based localisation urban environments using semantic matching images 2-d map. contrasts vast majority existing approaches use image image database matching. use highly compact binary descriptors represent semantic features locations, significantly increasing scalability compared existing methods potential greater invariance variable imaging conditions. approach also akin human map reading, making suited human-system interaction. binary descriptors indicate presence semantic features relating buildings road junctions discrete viewing directions. use cnn classifiers detect features images match descriptor estimates database location tagged descriptors derived 2-d map. isolation, descriptors sufficiently discriminative, concatenated sequentially along route, combination becomes highly distinctive allows localisation even using non-perfect classifiers. performance improved taking account left right turns route. experimental results obtained using google streetview openstreetmap data show approach considerable potential, achieving localisation accuracy around 85% using routes corresponding approximately 200 meters.",4 "rational competitive analysis. much work computer science adopted competitive analysis tool decision making uncertainty. work extend competitive analysis context multi-agent systems. unlike classical competitive analysis behavior agent's environment taken arbitrary, consider case agent's environment consists agents. agents usually obey (minimal) rationality constraints. leads definition rational competitive analysis. introduce concept rational competitive analysis, initiate study competitive analysis multi-agent systems. 
also discuss application rational competitive analysis context bidding games, well classical one-way trading problem.",4 "reasonable forced goal orderings use agenda-driven planning algorithm. paper addresses problem computing goal orderings, one longstanding issues ai planning. makes two new contributions. first, formally defines discusses two different goal orderings, called reasonable forced ordering. orderings defined simple strips operators well complex adl operators supporting negation conditional effects. complexity orderings investigated practical relevance discussed. secondly, two different methods compute reasonable goal orderings developed. one based planning graphs, investigates set actions directly. finally, shown ordering relations, derived given set goals g, used compute so-called goal agenda divides g ordered set subgoals. planner then, principle, use goal agenda plan increasing sets subgoals. lead exponential complexity reduction, solution complex planning problem found solving easier subproblems. since polynomial overhead caused goal agenda computation, potential exists dramatically speed planning algorithms demonstrate empirical evaluation, use method ipp planner.",4 "infinite variational autoencoder semi-supervised learning. paper presents infinite variational autoencoder (vae) whose capacity adapts suit input data. achieved using mixture model mixing coefficients modeled dirichlet process, allowing us integrate coefficients performing inference. critically, allows us automatically vary number autoencoders mixture based data. experiments show flexibility method, particularly semi-supervised learning, small number training samples available.",4 "text summarization using abstract meaning representation. ever increasing size text present internet, automatic summary generation remains important problem natural language understanding. work explore novel full-fledged pipeline text summarization intermediate step abstract meaning representation (amr). 
pipeline proposed us first generates amr graph input story, extracts summary graph finally, generate summary sentences summary graph. proposed method achieves state-of-the-art results compared text summarization routines based amr. also point significant problems existing evaluation methods, make unsuitable evaluating summary quality.",4 "monitoring chinese population migration consecutive weekly basis intra-city scale inter-province scale didi's bigdata. population migration valuable information leads proper decision urban-planning strategy, massive investment, many fields. instance, inter-city migration posterior evidence see government's constrain population works, inter-community immigration might prior evidence real estate price hike. timely data, also impossible compare city favorable people, suppose cities release different new regulations, could also compare customers different real estate development groups, come from, probably go. unfortunately data available. paper, leveraging data generated positioning team didi, propose novel approach timely monitoring population migration community scale provincial scale. migration detected soon week. could faster, setting week statistical purpose. monitoring system developed, applied nation wide china, observations derived system presented paper. new method migration perception origin insight nowadays people mostly moving personal access point (ap), also known wifi hotspot. assume ratio ap moving migration population constant, analysis comparative population migration would feasible. exact quantitative research would also done sample research model regression. procedures processing data includes many steps: eliminating impact pseudo-migration ap, instance pocket wifi, second-hand traded router; distinguishing moving population moving companies; identifying shifting ap finger print clusters, etc..",19 "semi-dense 3d semantic mapping monocular slam. 
bundle geometry appearance computer vision proven promising solution robots across wide variety applications. stereo cameras rgb-d sensors widely used realise fast 3d reconstruction trajectory tracking dense way. however, lack flexibility seamless switch different scaled environments, i.e., indoor outdoor scenes. addition, semantic information still hard acquire 3d mapping. address challenge combining state-of-the-art deep learning method semi-dense simultaneous localisation mapping (slam) based video stream monocular camera. approach, 2d semantic information transferred 3d mapping via correspondence connective keyframes spatial consistency. need obtain semantic segmentation frame sequence, could achieve reasonable computation time. evaluate method indoor/outdoor datasets lead improvement 2d semantic labelling baseline single frame predictions.",4 "improved eeg event classification using differential energy. feature extraction automatic classification eeg signals typically relies time frequency representations signal. techniques cepstral-based filter banks wavelets popular analysis techniques many signal processing applications including eeg classification. paper, present comparison variety approaches estimating postprocessing features. aid discrimination periodic signals aperiodic signals, add differential energy term. evaluate approaches tuh eeg corpus, largest publicly available eeg corpus exceedingly challenging task due clinical nature data. demonstrate variant standard filter bank-based approach, coupled first second derivatives, provides substantial reduction overall error rate. combination differential energy derivatives produces 24% absolute reduction error rate improves ability discriminate signal events background noise. relatively simple approach proves comparable popular feature extraction approaches wavelets, much computationally efficient.",6 "convolution aware initialization.
initialization parameters deep neural networks shown big impact performance networks (mishkin & matas, 2015). initialization scheme devised he et al., allowed convolution activations carry constrained mean allowed deep networks trained effectively (he et al., 2015a). orthogonal initializations generally orthogonal matrices standard recurrent networks proved eradicate vanishing exploding gradient problem (pascanu et al., 2012). majority current initialization schemes take fully account intrinsic structure convolution operator. using duality fourier transform convolution operator, convolution aware initialization builds orthogonal filters fourier space, using inverse fourier transform represents standard space. convolution aware initialization noticed higher accuracy lower loss, faster convergence. achieve new state art cifar10 dataset, achieve close state art various tasks.",4 "joint object category 3d pose estimation 2d images. 2d object detection task finding (i) objects present image (ii) located, 3d pose estimation task finding pose objects 3d space. state-of-the-art methods solving tasks follow two-stage approach 3d pose estimation system applied bounding boxes (with associated category labels) returned 2d detection method. paper addresses task joint object category 3d pose estimation given 2d bounding box. design residual network based architecture solve two seemingly orthogonal tasks new category-dependent pose outputs loss functions, show state-of-the-art performance challenging pascal3d+ dataset.",4 "fixing error caponnetto de vito (2007). seminal paper caponnetto de vito (2007) provides minimax-optimal rates kernel ridge regression general setting. proof, however, contains error bound effective dimensionality. note, explain mistake, provide correct bound, show main theorem remains true.",19 "one size fits many: column bundle multi-x learning.
much recent machine learning research directed towards leveraging shared statistics among labels, instances data views, commonly referred multi-label, multi-instance multi-view learning. underlying premises exist correlations among input parts among output targets, predictive performance would increase correlations incorporated. paper, propose column bundle (clb), novel deep neural network capturing shared statistics data. clb generic architecture applied various types shared statistics changing input output handling. clb capable scaling thousands input parts output labels avoiding explicit modeling pairwise relations. evaluate clb different types data: (a) multi-label, (b) multi-view, (c) multi-view/multi-label (d) multi-instance. clb demonstrates comparable competitive performance datasets state-of-the-art methods designed specifically type.",19 "ranking pages topology popularity within web sites. compare two link analysis ranking methods web pages site. first, called site rank, adaptation pagerank granularity web site second, called popularity rank, based frequencies user clicks outlinks page captured navigation sessions users web site. ran experiments artificially created web sites different sizes two real data sets, employing relative entropy compare distributions two ranking methods. real data sets also employ nonparametric measure, called spearman's footrule, use compare top-ten web pages ranked two methods. main result distributions popularity rank site rank surprisingly close other, implying topology web site instrumental guiding users site. thus, practice, site rank provides reasonable first order approximation aggregate behaviour users within web site given popularity rank.",4 "kernel method detecting higher order interactions multi-view data: application imaging, genetics, epigenetics. 
In this study, we tested the interaction effect of multimodal datasets using a novel method, called the kernel method, for detecting higher order interactions among biologically relevant multi-view data. Using a semiparametric method in a reproducing kernel Hilbert space (RKHS), we used a standard mixed-effects linear model and derived a score-based variance component statistic that tests for higher order interactions in multi-view data. The proposed method offers an intangible framework for the identification of higher order interaction effects (e.g., three-way interactions) between genetics, brain imaging, and epigenetic data. Extensive numerical simulation studies were first conducted to evaluate the performance of this method. Finally, the method was evaluated using data from the Mind Clinical Imaging Consortium (MCIC), including single nucleotide polymorphism (SNP) data, functional magnetic resonance imaging (fMRI) scans, and deoxyribonucleic acid (DNA) methylation data, respectively, from schizophrenia patients and healthy controls. We treated each gene-derived SNP, region of interest (ROI), and gene-derived DNA methylation as a single testing unit, and combined them into triplets for evaluation. In addition, cardiovascular disease risk factors such as age, gender, and body mass index were assessed as covariates on hippocampal volume and compared with each triplet. Our method identified $13$ triplets ($p$-values $\leq 0.001$) that included $6$ gene-derived SNPs, $10$ ROIs, and $6$ gene-derived DNA methylations correlated with changes in hippocampal volume, suggesting that these triplets may be important in explaining schizophrenia-related neurodegeneration. With strong evidence ($p$-values $\leq 0.000001$), the triplet ({\bf MAGI2, CRBLCrus1.L, FBXO28}) has the potential to distinguish schizophrenia patients from healthy controls based on these variations.",19
"Urban ozone concentration forecasting with artificial neural networks in Corsica. Forecasting the concentration of atmospheric pollutants is an important issue in air quality monitoring. Qualitair Corse, the organization responsible for monitoring air quality in the Corsica (France) region, needs to develop a short-term prediction model to carry out its mission of information towards the public. 
Various deterministic models exist for meso-scale or local forecasting, but they need powerful and large variable sets and a good knowledge of the atmospheric processes, and they can be inaccurate because of the local climatical and geographical particularities observed in Corsica, a mountainous island located in the Mediterranean Sea. As a result, we focus in this study on statistical models, and particularly on artificial neural networks (ANNs), which have shown good results in the prediction of ozone concentration at horizon h+1 with data measured locally. The purpose of this study is to build a predictor that realizes predictions of ozone and PM10 at horizon d+1 in Corsica in order to be able to anticipate pollution peak formation and take appropriate prevention measures. Specific meteorological conditions are known to lead to particular pollution events in Corsica (e.g. Saharan dust events). Therefore, several ANN models are used, from meteorological condition clustering to operational forecasting.",4
"Unsupervised dynamic image segmentation using a fuzzy Hopfield neural network based genetic algorithm. This paper proposes a genetic algorithm based segmentation method that can automatically segment gray-scale images. The proposed method mainly consists of spatial unsupervised grayscale image segmentation that divides an image into regions. The aim of this algorithm is to produce a precise segmentation of images using intensity information along with neighborhood relationships. In this paper, fuzzy Hopfield neural network (FHNN) clustering helps in generating the population of the genetic algorithm that automatically segments the image. This technique is a powerful method for image segmentation and works for both single and multiple-feature data with spatial information. A validity index is utilized for introducing a robust technique for finding the optimum number of components in an image. Experimental results have shown that the algorithm generates good quality segmented images.",4
"Intrusion detection using continuous time Bayesian networks. Intrusion detection systems (IDSs) fall into two high-level categories: network-based systems (NIDS) that monitor network behaviors, and host-based systems (HIDS) that monitor system calls. In this work, we present a general technique for both types of systems. We use anomaly detection, which identifies patterns not conforming to a historic norm. 
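The anomaly-detection principle just stated — flag behavior whose likelihood under a model learned from normal data falls below a threshold — can be sketched with a toy stand-in. The abstract's actual model is a CTBN; here a one-dimensional Gaussian over inter-event times is an illustrative assumption, not the paper's method.

```python
import math

def fit_gaussian(samples):
    """Fit a 1-D Gaussian to normal (non-attack) training data."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var

def log_likelihood(x, mean, var):
    """Log-density of x under the learned Gaussian model of the norm."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def flag_anomalies(observations, mean, var, threshold):
    """Flag observations whose likelihood under the norm is too low."""
    return [x for x in observations if log_likelihood(x, mean, var) < threshold]

# Train on "normal" inter-packet intervals, then flag a bursty outlier.
normal = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
mean, var = fit_gaussian(normal)
print(flag_anomalies([1.0, 5.0], mean, var, threshold=-10.0))
```

The generative-model-plus-likelihood-threshold structure is the same whether the model is this Gaussian or a hierarchical CTBN over packet traces.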
In both types of systems, the rates of change vary dramatically over time (due to burstiness) and over components (due to service difference). To efficiently model such systems, we use continuous time Bayesian networks (CTBNs) and avoid specifying a fixed update interval common to discrete-time models. We build generative models from the normal training data, and abnormal behaviors are flagged based on their likelihood under this norm. For NIDS, we construct a hierarchical CTBN model for the network packet traces and use Rao-Blackwellized particle filtering to learn the parameters. We illustrate the power of the method through experiments in detecting real worms and identifying hosts on two publicly available network traces, the MAWI dataset and the LBNL dataset. For HIDS, we develop a novel learning method to deal with the finite resolution of system log file time stamps, without losing the benefits of the continuous time model. We demonstrate the method by detecting intrusions in the DARPA 1998 BSM dataset.",4
"Distributed evolutionary k-way node separators. Computing high quality node separators in large graphs is necessary for a variety of applications, ranging from divide-and-conquer algorithms to VLSI design. In this work, we present a novel distributed evolutionary algorithm tackling the k-way node separator problem. A key component of our contribution includes new k-way local search algorithms based on maximum flows. We combine our local search with a multilevel approach to compute an initial population for our evolutionary algorithm, and show how to modify the coarsening stage of our multilevel algorithm to create effective combine and mutation operations. Lastly, we combine these techniques with a scalable communication protocol, producing a system that is able to compute high quality solutions in a short amount of time. Experiments against competing algorithms show that our advanced evolutionary algorithm computes the best result on 94% of the chosen benchmark instances.",4
"Instance similarity deep hashing for multi-label image retrieval. Hash coding has been widely used in approximate nearest neighbor search for large-scale image retrieval. Recently, many deep hashing methods have been proposed and have shown largely improved performance over traditional feature-learning-based methods. 
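The retrieval setting just mentioned — approximate nearest neighbor search over binary hash codes — reduces at query time to a Hamming-distance lookup. A minimal sketch (the codes and database here are made up for illustration):

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

def nearest(query, database):
    """Approximate NN search: index of the stored code closest to the query."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))

db = [[0, 0, 1, 1], [1, 1, 1, 0], [0, 1, 0, 1]]
print(nearest([1, 1, 0, 0], db))
```

In practice the codes come from a learned hash function and the scan is replaced by multi-index or bit-count tricks; the distance itself is exactly this.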
Most of these methods examine the pairwise similarity on the semantic-level labels, where the pairwise similarity is generally defined in a hard-assignment way. That is, the pairwise similarity is '1' if they share no less than one class label and '0' if they do not share any. However, such a similarity definition cannot reflect the similarity ranking for pairwise images that hold multiple labels. In this paper, a new deep hashing method is proposed for multi-label image retrieval by re-defining the pairwise similarity into an instance similarity, where the instance similarity is quantified into a percentage based on the normalized semantic labels. Based on the instance similarity, a weighted cross-entropy loss and a minimum mean square error loss are tailored for loss-function construction, and are efficiently used for simultaneous feature learning and hash coding. Experiments on three popular datasets demonstrate that the proposed method outperforms the competing methods and achieves state-of-the-art performance in multi-label image retrieval.",4
"A comparison of several reweighted l1-algorithms for solving cardinality minimization problems. Reweighted l1-algorithms have attracted a lot of attention in the field of applied mathematics. A unified framework for such algorithms was recently proposed by Zhao and Li. In this paper we construct some new examples of reweighted l1-methods. These functions are certain concave approximations of the l0-norm function. We focus on a numerical comparison between the new and existing reweighted l1-algorithms. We show how the change of parameters in reweighted algorithms may affect the performance of the algorithms for finding the solution of the cardinality minimization problem. In our experiments, the problem data were generated according to different statistical distributions, and we test the algorithms on different sparsity levels of the solution of the problem. Our numerical results demonstrate that the reweighted l1-method is one of the efficient methods for locating the solution of the cardinality minimization problem.",12
"Learning to compare: relation network for few-shot learning. We present a conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only a few examples from each. Our method, called the Relation Network (RN), is trained end-to-end from scratch. 
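The reweighted l1 scheme summarized above rests on a concave surrogate of the l0-norm whose gradient supplies the next round of weights. The particular log-based surrogate and the Candès-style weights below are illustrative choices, since the abstract does not state which approximations are used; the l1 subproblem solver is omitted.

```python
import math

def l0_surrogate(x, eps=1e-6):
    """Concave approximation of the l0-norm:
    sum_i log(1 + |x_i|/eps) / log(1 + 1/eps) -> ||x||_0 as eps -> 0."""
    return sum(math.log(1 + abs(xi) / eps) for xi in x) / math.log(1 + 1 / eps)

def reweight(x, eps=1e-6):
    """Weights for the next weighted-l1 subproblem (gradient of the surrogate,
    up to a constant): small entries get large weights and are pushed to zero."""
    return [1.0 / (abs(xi) + eps) for xi in x]

print(l0_surrogate([1.0, 0.0, 0.0]))
```

Each iteration of a full reweighted l1-method would minimize sum_i w_i |x_i| subject to the problem's constraints, then refresh the weights with `reweight`.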
During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. Once trained, an RN is able to classify images of new classes by computing relation scores between query images and the few examples of each new class without further updating the network. Besides providing improved performance on few-shot learning, our framework is easily extended to zero-shot learning. Extensive experiments on four datasets demonstrate that our simple approach provides a unified and effective approach for both of these two tasks.",4
"Decision-theoretic planning with concurrent temporally extended actions. We investigate a model for planning under uncertainty with temporally extended actions, where multiple actions can be taken concurrently at each decision epoch. Our model is based on the options framework, and combines it with factored state space models, where the set of options can be partitioned into classes that affect disjoint state variables. We show that the set of decision epochs for concurrent options defines a semi-Markov decision process, if the underlying temporally extended actions being parallelized are restricted to Markov options. This property allows us to use SMDP algorithms for computing the value function over concurrent options. The concurrent options model allows overlapping execution of options in order to achieve higher performance or in order to perform a complex task. We describe a simple experiment using a navigation task which illustrates how concurrent options result in a faster plan when compared to the case where only one option is taken at a time.",4
"Correlation clustering with noisy partial information. In this paper, we propose and study a semi-random model for the correlation clustering problem on arbitrary graphs G. We give two approximation algorithms for correlation clustering instances from this model. The first algorithm finds a solution of value $(1+ \delta) optcost + O_{\delta}(n\log^3 n)$ with high probability, where $optcost$ is the value of the optimal solution (for every $\delta > 0$). The second algorithm finds the ground truth clustering with an arbitrarily small classification error $\eta$ (under additional assumptions on the instance).",4
"Particle filtering on the audio localization manifold. We present a novel particle filtering algorithm for tracking a moving sound source using a microphone array. 
If there are N microphones in the array, we track all $N \choose 2$ delays with a single particle filter over time. Since it is known that tracking in high dimensions is rife with difficulties, we instead integrate into our particle filter a model of the low dimensional manifold that these delays lie on. Our manifold model is based on work on modeling low dimensional manifolds via random projection trees [1]. In addition, we also introduce a new weighting scheme to our particle filtering algorithm based on recent advancements in online learning. We show that our novel TDOA tracking algorithm that integrates a manifold model can greatly outperform standard particle filters on this audio tracking task.",4
"An effective feature selection method based on pair-wise feature proximity for high dimensional low sample size data. Feature selection has been studied widely in the literature. However, the efficacy of the selection criteria in low sample size applications is neglected in most cases. Most of the existing feature selection criteria are based on sample similarity. However, the distance measures become insignificant for high dimensional low sample size (HDLSS) data. Moreover, the variance of a feature with a few samples is pointless unless it represents the data distribution efficiently. Instead of looking at the samples in groups, we evaluate their efficiency in a pairwise fashion. In our investigation, we noticed that considering a pair of samples at a time and selecting the features that bring them closer or put them far away is a better choice for feature selection. Experimental results on benchmark data sets demonstrate the effectiveness of the proposed method for low sample sizes, where it outperforms many state-of-the-art feature selection methods.",4
"Progressive versus random projections for compressive capture of images, lightfields and higher dimensional visual signals. Computational photography involves sophisticated capture methods. A new trend is to capture projections of higher dimensional visual signals such as videos, multi-spectral data and lightfields on lower dimensional sensors. Carefully designed capture methods exploit the sparsity of the underlying signal in a transformed domain to reduce the number of measurements and use an appropriate reconstruction method. 
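The sparsity-based capture idea just described can be made concrete with a toy progressive scheme: project a signal onto an orthonormal DCT basis, keep only the k largest coefficients, and measure reconstruction SNR as a function of k. This is a minimal sketch of the "effectiveness" curve idea, not the paper's measurement procedure.

```python
import math

def dct_basis(n):
    """Orthonormal DCT-II basis vectors for signals of length n."""
    basis = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        basis.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    return basis

def snr_after_topk(signal, k):
    """Reconstruction SNR (dB) when only the k largest DCT coefficients are kept."""
    n = len(signal)
    B = dct_basis(n)
    coeffs = [sum(b[i] * signal[i] for i in range(n)) for b in B]
    kept = sorted(range(n), key=lambda j: -abs(coeffs[j]))[:k]
    recon = [sum(coeffs[j] * B[j][i] for j in kept) for i in range(n)]
    err = sum((s - r) ** 2 for s, r in zip(signal, recon))
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / err) if err > 1e-12 else float("inf")

sig = [math.sin(math.pi * i / 7) for i in range(8)]
print(snr_after_topk(sig, 2), snr_after_topk(sig, 6))
```

Plotting this SNR against the compression factor n/k gives the progressive half of the comparison; the random-projection half would replace the top-k DCT rows with random measurement vectors and a sparse-recovery solver.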
Traditional progressive methods may capture successively more detail using a sequence of simple projection bases, such as DCT or wavelets, and employ straightforward backprojection for reconstruction. Randomized projection methods do not use any specific sequence and use l0 minimization for reconstruction. In this paper, we analyze the statistical properties of natural images, videos, multi-spectral data and light-fields, and compare the effectiveness of progressive and random projections. We define effectiveness by plotting reconstruction SNR against compression factor. The key idea is a procedure to measure best-case effectiveness that is fast, independent of specific hardware and independent of the reconstruction procedure. We believe this is the first empirical study to compare different lossy capture strategies without the complication of hardware or reconstruction ambiguity. The scope is limited to linear non-adaptive sensing. The results show that random projections produce significant advantages over other projections only for higher dimensional signals, and suggest more research on nascent adaptive and non-linear projection methods.",4
"Multi-level anomaly detection on time-varying graph data. This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to the community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating probabilities at finer levels, and these closely related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, this multi-scale analysis facilitates intuitive visualizations, allowing users to narrow their focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. 
We demonstrate that our graph statistics-based approach outperforms both the distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. To illustrate the accessibility of the information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and with precision greater than 0.786.",4
"Multiple object tracking with context awareness. Multiple people tracking is a key problem for many applications such as surveillance, animation or car navigation, and a key input for tasks such as activity recognition. In crowded environments occlusions and false detections are common, and although there have been substantial advances in recent years, tracking is still a challenging task. Tracking is typically divided into two steps: detection, i.e., locating the pedestrians in the image, and data association, i.e., linking the detections across frames to form complete trajectories. For the data association task, approaches typically aim at developing new, more complex formulations, which in turn put the focus on the optimization techniques required to solve them. However, they still utilize very basic information such as the distance between detections. In this thesis, we focus on the data association task and argue that there is contextual information that has not been fully exploited yet by the tracking community, mainly the social context and the spatial context coming from different views.",4
Kernel diff-hash. This paper presents a kernel formulation of the recently introduced diff-hash algorithm for the construction of similarity-sensitive hash functions. Our kernel diff-hash algorithm shows superior performance on the problem of image feature descriptor matching.,4
"Extractive summarization: limits, compression, generalized model and heuristics. Due to its promise to alleviate information overload, text summarization has attracted the attention of many researchers. However, it has remained a serious challenge. Here, we first prove empirical limits on the recall (and F1-scores) of extractive summarizers on the DUC datasets under ROUGE evaluation, for both the single-document and multi-document summarization tasks. 
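One way to estimate the kind of empirical recall limit just mentioned is to ask how well an oracle extract could possibly do: greedily pick the sentences that cover the most uncovered reference words. This greedy coverage estimate is an illustrative stand-in, not the paper's proof technique.

```python
def oracle_recall(sentences, reference_tokens, budget):
    """Greedy estimate of the maximum unigram recall any extractive summary
    of at most `budget` sentences can achieve against a reference summary."""
    ref = set(reference_tokens)
    covered = set()
    for _ in range(budget):
        gains = [(len((set(s) & ref) - covered), s) for s in sentences]
        gain, best = max(gains, key=lambda g: g[0])
        if gain == 0:
            break
        covered |= set(best) & ref
    return len(covered) / len(ref)

print(oracle_recall([["a", "b"], ["b", "c"], ["d"]], ["a", "b", "c", "x"], 2))
```

Because "x" never appears in the document, no extract can exceed 0.75 recall here; that gap is the sense in which extractive recall has a hard ceiling.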
Next we define the concept of the compressibility of a document and present a new model of summarization, which generalizes existing models in the literature and integrates several dimensions of summarization, viz., abstractive versus extractive, single versus multi-document, and syntactic versus semantic. Finally, we examine some new and existing single-document summarization algorithms in a single framework and compare them with the state of the art summarizers on DUC data.",4
"Language models for image captioning: the quirks and what works. Two recent approaches have achieved state-of-the-art results in image captioning. The first uses a pipelined process where a set of candidate words is generated by a convolutional neural network (CNN) trained on images, and then a maximum entropy (ME) language model is used to arrange these words into a coherent sentence. The second uses the penultimate activation layer of the CNN as input to a recurrent neural network (RNN) that then generates the caption sequence. In this paper, we compare the merits of these different language modeling approaches for the first time by using the same state-of-the-art CNN as input. We examine issues in the different approaches, including linguistic irregularities, caption repetition, and data set overlap. By combining key aspects of the ME and RNN methods, we achieve a new record performance over previously published results on the benchmark COCO dataset. However, the gains we see in BLEU do not translate to human judgments.",4
"Ambiguity in language networks. Human language defines one of the most complex outcomes of evolution. The emergence of such an elaborated form of communication allowed humans to create extremely structured societies and manage symbols at different levels including, among others, semantics. Linguistic levels have to deal with an astronomic combinatorial potential that stems from the recursive nature of languages. This recursiveness is indeed a key defining trait. However, not all words are equally combined nor equally frequent. In breaking the symmetry between less and more often used and less and more meaning-bearing units, universal scaling laws arise. Such laws, common to all human languages, appear at different stages, from word inventories to networks of interacting words. Among the seemingly universal traits exhibited by language networks, ambiguity appears as a specially relevant component. 
Ambiguity is avoided in most computational approaches to language processing, yet it seems to be a crucial element of language architecture. Here we review evidence both from language network architecture and from theoretical reasonings based on a least effort argument. Ambiguity is shown to play an essential role in providing a source of language efficiency, and is likely an inevitable byproduct of network growth.",15
"Linguistic descriptions of data help the teaching-learning process in higher education, case study: artificial intelligence. Artificial intelligence is a central topic in the computer science curriculum. Since the year 2011, a project-based learning methodology based on computer games has been designed and implemented in the intelligence artificial course at the University of Bio-Bio. The project aims to develop software-controlled agents (bots) that are programmed using the heuristic algorithms seen in the course. This methodology allows us to obtain good learning results; however, several challenges were found during its implementation. In this paper we show how linguistic descriptions of data can help to provide students and teachers with technical and personalized feedback about the learned algorithms. An algorithm behavior profile and a new Turing test for computer games bots, based on the linguistic modelling of complex phenomena, are also proposed in order to deal with these challenges. In order to show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors, and its incorporation in the assessment process allows us to improve the teaching and learning process.",4
"Dynamic controllability of conditional STNs with uncertainty. Recent attempts to automate business processes and medical-treatment processes have uncovered the need for a formal framework that can accommodate not only temporal constraints, but also observations and actions with uncontrollable durations. To meet this need, this paper defines a conditional simple temporal network with uncertainty (CSTNU) that combines the simple temporal constraints from a simple temporal network (STN) with the conditional nodes from a conditional simple temporal problem (CSTP) and the contingent links from a simple temporal network with uncertainty (STNU). A notion of dynamic controllability for a CSTNU is defined that generalizes the dynamic consistency of a CTP and the dynamic controllability of an STNU. 
The paper also presents some sound constraint-propagation rules for dynamic controllability that are expected to form the backbone of a dynamic-controllability-checking algorithm for CSTNUs.",4
"Adversarial examples are not easily detected: bypassing ten detection methods. Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and that the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.",4
"Web-scale training for face identification. Scaling machine learning methods to very large datasets has attracted considerable attention in recent years, thanks to easy access to ubiquitous sensing and data from the web. We study face recognition and show that three distinct properties have surprising effects on the transferability of deep convolutional networks (CNN): (1) The bottleneck of the network serves as an important transfer learning regularizer, and (2) in contrast to the common wisdom, performance saturation may exist in CNNs (as the number of training samples grows); we propose a solution for alleviating this by replacing the naive random subsampling of the training set with a bootstrapping process. Moreover, (3) we find a link between the representation norm and the ability to discriminate in a target domain, which sheds light on how such networks represent faces. Based on these discoveries, we are able to improve face recognition accuracy on the widely used LFW benchmark, both for the verification (1:1) and identification (1:N) protocols, and directly compare, for the first time, with a state of the art commercially-off-the-shelf system and show a sizable leap in performance.",4
"A general framework for development of a cortex-like visual object recognition system: waves of spikes, predictive coding and a universal dictionary of features. This study is focused on the development of a cortex-like visual object recognition system. We propose a general framework, which consists of three hierarchical levels (modules). 
These modules functionally correspond to the V1, V4 and IT areas. Both bottom-up and top-down connections between the hierarchical levels V4 and IT are employed. The higher the degree of matching between the input and the preferred stimulus, the shorter the response time of the neuron. Therefore information about a single stimulus is distributed in time and transmitted by waves of spikes. The reciprocal connections and waves of spikes implement predictive coding: an initial hypothesis is generated on the basis of the information delivered by the first wave of spikes and is tested against the information carried by the consecutive waves. Development is considered as the extraction and accumulation of features in V4 and objects in IT. A stored feature can be disposed of if it is rarely activated. This causes an update of the feature repository. Consequently, the objects are also updated. This illustrates a growing process with dynamical change of the topological structures of V4, IT and the connections between the areas.",4
"SLDR-DL: a framework for SLD-resolution with deep learning. This paper introduces an SLD-resolution technique based on deep learning. This technique enables neural networks to learn from old and successful resolution processes and to use the learnt experiences to guide new resolution processes. An implementation of this technique is named SLDR-DL. It includes a Prolog library of deep feedforward neural networks and some essential functions of resolution. In the SLDR-DL framework, users can define logical rules in the form of definite clauses and teach neural networks to use the rules in reasoning processes.",4
"Single and multiple illuminant estimation using convolutional neural networks. In this paper we present a method for the estimation of the color of the illuminant in raw images. The method includes a convolutional neural network specially designed to produce multiple local estimates. A multiple illuminant detector determines whether or not the local outputs of the network must be aggregated into a single estimate. We evaluated our method on standard datasets with single and multiple illuminants, obtaining lower estimation errors with respect to those obtained by general purpose methods in the state of the art.",4
A tail inequality for quadratic forms of subgaussian random vectors. We prove an exponential probability tail inequality for positive semidefinite quadratic forms in a subgaussian random vector. 
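A bound of this type (as stated by Hsu, Kakade and Zhang, 2012, for a $\sigma$-subgaussian vector $x$ and positive semidefinite $A$; restated here for reference) reads, for all $t > 0$:

```latex
\Pr\!\left[\, x^\top A x \;>\; \sigma^2\!\left( \operatorname{tr}(A) \;+\; 2\sqrt{\operatorname{tr}(A^2)\, t} \;+\; 2\,\|A\|\, t \right) \right] \;\le\; e^{-t}.
```

For a standard Gaussian vector this recovers the familiar Laurent–Massart chi-square tail, which is the sense in which the bound is "analogous" to the independent-Gaussian case.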
The bound is analogous to one that holds when the vector has independent Gaussian entries.,12
"ProDiGe: prioritization of disease genes with multitask machine learning from positive and unlabeled examples. Elucidating the genetic basis of human diseases is a central goal of genetics and molecular biology. While traditional linkage analysis and modern high-throughput techniques often provide long lists of tens or hundreds of disease gene candidates, the identification of disease genes among the candidates remains time-consuming and expensive. Efficient computational methods are therefore needed to prioritize genes within the list of candidates, by exploiting the wealth of information available about the genes in various databases. We propose ProDiGe, a novel algorithm for the prioritization of disease genes. ProDiGe implements a novel machine learning strategy based on learning from positive and unlabeled examples, which allows us to integrate various sources of information about the genes, to share information about known disease genes across diseases, and to perform genome-wide searches for new disease genes. Experiments on real data show that ProDiGe outperforms state-of-the-art methods for the prioritization of genes in human diseases.",16
"Exact tensor completion using t-SVD. In this paper we focus on the problem of completion of multidimensional arrays (also referred to as tensors) from limited sampling. Our approach is based on a recently proposed tensor-singular value decomposition (t-SVD) [1]. Using this factorization one can derive a notion of tensor rank, referred to as the tensor tubal rank, which has optimality properties similar to those of the matrix rank derived from the SVD. As shown in [2], some multidimensional data, such as panning video sequences, exhibit low tensor tubal rank, and we look at the problem of completing such data under random sampling of the data cube. We show that by solving a convex optimization problem, which minimizes the tensor nuclear norm obtained as a convex relaxation of the tensor tubal rank, one can guarantee recovery with overwhelming probability as long as samples in proportion to the degrees of freedom in the t-SVD are observed. In this sense our results are order-wise optimal. The conditions under which this result holds are similar to the incoherency conditions for matrix completion, albeit we define incoherency under the algebraic set-up of the t-SVD. 
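The tubal rank central to the guarantee above is computed from the t-SVD by taking an FFT along the third (tube) dimension and looking at the ranks of the frontal slices in the Fourier domain. A minimal sketch (the example tensors are made up for illustration):

```python
import numpy as np

def tubal_rank(T, tol=1e-10):
    """Tensor tubal rank via the t-SVD: FFT along the tube dimension,
    then the maximum matrix rank over the Fourier-domain frontal slices."""
    That = np.fft.fft(T, axis=2)
    return max(np.linalg.matrix_rank(That[:, :, k], tol=tol)
               for k in range(T.shape[2]))

# A rank-1 slice replicated along the tube dimension has tubal rank 1.
a = np.arange(1.0, 5.0)
b = np.arange(1.0, 6.0)
T = np.outer(a, b)[:, :, None] * np.ones((1, 1, 3))
print(tubal_rank(T))
```

The completion algorithm itself then minimizes the tensor nuclear norm (the convex surrogate of this quantity) over tensors agreeing with the observed entries, which requires a convex solver and is not shown here.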
We show the performance of the algorithm on some real data sets and compare it with existing approaches based on tensor flattening and the Tucker decomposition.",4
"Summarizing decisions in spoken meetings. This paper addresses the problem of summarizing decisions in spoken meetings: our goal is to produce a concise {\it decision abstract} for each meeting decision. We explore and compare token-level and dialogue act-level automatic summarization methods using both unsupervised and supervised learning frameworks. In the supervised summarization setting, given true clusterings of decision-related utterances, we find that token-level summaries that employ discourse context can approach an upper bound of decision abstracts derived directly from dialogue acts. In the unsupervised summarization setting, we find that summaries based on unsupervised partitioning of decision-related utterances perform comparably to those based on partitions generated using supervised techniques (0.22 ROUGE-F1 using LDA-based topic models vs. 0.23 using SVMs).",4
"SegAN: adversarial network with multi-scale $L_1$ loss for medical image segmentation. Inspired by classic generative adversarial networks (GAN), we propose a novel end-to-end adversarial neural network, called SegAN, for the task of medical image segmentation. Since image segmentation requires dense, pixel-level labeling, the single scalar real/fake output of a classic GAN's discriminator may be ineffective in producing stable and sufficient gradient feedback to the networks. Instead, we use a fully convolutional neural network as the segmentor to generate segmentation label maps, and propose a novel adversarial critic network with a multi-scale $L_1$ loss function to force the critic and segmentor to learn both global and local features that capture long- and short-range spatial relationships between pixels. In our SegAN framework, the segmentor and critic networks are trained in an alternating fashion in a min-max game: the critic takes as input a pair of images, (original_image $*$ predicted_label_map, original_image $*$ ground_truth_label_map), and is trained by maximizing a multi-scale loss function; the segmentor is trained with only the gradients passed along by the critic, with the aim to minimize the multi-scale loss function. 
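The multi-scale $L_1$ loss just described compares features of the two masked inputs at several spatial resolutions. As a sketch, average pooling at a few scales stands in for the critic's layer-wise feature maps (an assumption for illustration — SegAN compares features extracted by critic layers, not pooled images):

```python
import numpy as np

def avg_pool(x, k):
    """k x k average pooling with stride k (assumes dims divisible by k)."""
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def multiscale_l1(f_pred, f_true, scales=(1, 2, 4)):
    """Multi-scale L1: mean absolute difference of the two inputs at
    several resolutions, summed over scales."""
    return sum(np.abs(avg_pool(f_pred, k) - avg_pool(f_true, k)).mean()
               for k in scales)

print(multiscale_l1(np.ones((4, 4)), np.zeros((4, 4))))
```

In the min-max game, the critic's parameters are updated to maximize this quantity while the segmentor's are updated to minimize it, giving the segmentor dense gradient feedback at every scale rather than a single real/fake scalar.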
We show that such a SegAN framework is more effective and stable for the segmentation task, and it leads to better performance than the state-of-the-art U-net segmentation method. We tested our SegAN method using datasets from the MICCAI BRATS brain tumor segmentation challenge. Extensive experimental results demonstrate the effectiveness of the proposed SegAN with multi-scale loss: on BRATS 2013, SegAN gives performance comparable to the state of the art for whole tumor and tumor core segmentation while achieving better precision and sensitivity for Gd-enhanced tumor core segmentation; on BRATS 2015, SegAN achieves better performance than the state of the art in both dice score and precision.",4
"End-to-end deep reinforcement learning for lane keeping assist. Reinforcement learning is considered to be a strong AI paradigm which can be used to teach machines through interaction with the environment and learning from their mistakes, yet it has not been successfully used for automotive applications. There has recently been a revival of interest in the topic, however, driven by the ability of deep learning algorithms to learn good representations of the environment. Motivated by Google DeepMind's successful demonstrations of learning for games from Breakout to Go, we propose different methods for autonomous driving using deep reinforcement learning. This is of particular interest as it is difficult to pose autonomous driving as a supervised learning problem, due to its strong interaction with the environment, including other vehicles, pedestrians and roadworks. As this is a relatively new area of research for autonomous driving, we formulate two main categories of algorithms: 1) a discrete actions category, and 2) a continuous actions category. For the discrete actions category, we deal with the deep Q-network algorithm (DQN), and for the continuous actions category, we deal with the deep deterministic actor critic algorithm (DDAC). In addition to that, we also investigate the performance of the two categories on an open source car simulator for racing called TORCS, which stands for The Open Racing Car Simulator. Our simulation results demonstrate learning of autonomous maneuvering in a scenario of complex road curvatures and simple interaction with other vehicles. Finally, we explain the effect of some restricted conditions, put on the car during the learning phase, on the convergence time for finishing its learning phase.",19
"Double sparse multi-frame image super resolution. 
A large number of image super resolution algorithms based on sparse coding have been proposed, but only a few of these algorithms can realize multi-frame super resolution. In multi-frame super resolution based on sparse coding, both accurate image registration and sparse coding are required. A previous study on multi-frame super resolution based on sparse coding firstly applies block matching for image registration, followed by sparse coding to enhance the image resolution. In this paper, these two problems are solved by optimizing a single objective function. The results of numerical experiments support the effectiveness of the proposed approach.",4
"Word learning under infinite uncertainty. Language learners must learn the meanings of many thousands of words, despite those words occurring in complex environments in which infinitely many meanings might be inferred by the learner as a word's true meaning. This problem of infinite referential uncertainty is often attributed to Willard Van Orman Quine. We provide a mathematical formalisation of an ideal cross-situational learner attempting to learn under infinite referential uncertainty, and identify conditions under which word learning is possible. As Quine's intuitions suggest, learning under infinite uncertainty is in fact possible, provided that learners have some means of ranking candidate word meanings in terms of their plausibility; furthermore, our analysis shows that this ranking could in fact be exceedingly weak, implying that the constraints which allow learners to infer the plausibility of candidate word meanings could themselves be very weak. This approach lifts the burden of explanation from `smart' word learning constraints in learners, and suggests a programme of research into weak, unreliable, probabilistic constraints on the inference of word meaning in real word learners.",15
"Defensive forecasting for optimal prediction with expert advice. The method of defensive forecasting is applied to the problem of prediction with expert advice for binary outcomes. It turns out that defensive forecasting is not only competitive with the aggregating algorithm but also handles the case of ""second-guessing"" experts, whose advice depends on the learner's prediction; this paper assumes that the dependence on the learner's prediction is continuous.",4
"Consistency of AUC pairwise optimization. 
AUC (area under the ROC curve) is an important evaluation criterion, which has been popularly used in many learning tasks such as class-imbalance learning, cost-sensitive learning, learning to rank, etc. Many learning approaches try to optimize AUC, but owing to the non-convexity and discontinuousness of AUC, almost all approaches work with surrogate loss functions. Thus, the consistency of AUC is crucial; however, it has been almost untouched before. In this paper, we provide a sufficient condition for the asymptotic consistency of learning approaches based on surrogate loss functions. Based on this result, we prove that the exponential loss and logistic loss are consistent with AUC, but the hinge loss is inconsistent. Then, we derive the $q$-norm hinge loss and general hinge loss that are consistent with AUC. We also derive consistent bounds for the exponential loss and logistic loss, and obtain consistent bounds for many surrogate loss functions under the non-noise setting. Further, we disclose an equivalence between the exponential surrogate loss of AUC and the exponential surrogate loss of accuracy, and one straightforward consequence of this finding is that AdaBoost and RankBoost are equivalent.",4
"Monocular visual odometry for an unmanned sea-surface vehicle. We tackle the problem of localizing an autonomous sea-surface vehicle in river estuarine areas using a monocular camera and angular velocity input from an inertial sensor. Our method is challenged by two prominent drawbacks associated with the environment, which are typically not present in standard visual simultaneous localization and mapping (SLAM) applications on land (or in the air): a) the scene depth varies significantly (from a few meters to several kilometers) and, b) in conjunction with the latter, there exists no ground plane to provide features with enough disparity based on which to reliably detect motion. To that end, we use the IMU orientation feedback in order to re-cast the problem as one of visual localization without the mapping component, although a map is implicitly obtained from the camera pose estimates. We find that our method produces reliable odometry estimates for trajectories several hundred meters long on the water. We compare the visual odometry estimates with GPS based ground truth, and interpolate both trajectories with splines on a common parameter to obtain the position error in meters after recovering an optimal affine transformation between the two splines.",4
"Sparse communication for distributed gradient descent. 
we make distributed stochastic gradient descent faster by exchanging sparse updates instead of dense updates. gradient updates are positively skewed as most updates are near zero, so we map the 99% smallest updates (by absolute value) to zero and then exchange sparse matrices. this method can be combined with quantization to further improve compression. we explore different configurations and apply them to neural machine translation and mnist image classification tasks. most configurations work on mnist, whereas different configurations reduce the convergence rate on the more complex translation task. our experiments show that we can achieve up to 49% speed up on mnist and 22% on nmt without damaging the final accuracy or bleu.",4 "sequence-to-sequence generation for spoken dialogue via deep syntax trees and strings. we present a natural language generator based on the sequence-to-sequence approach that can be trained to produce natural language strings as well as deep syntax dependency trees from input dialogue acts, and we use it to directly compare two-step generation with separate sentence planning and surface realization stages to a joint, one-step approach. we were able to train both setups successfully using very little training data. the joint setup offers better performance, surpassing state-of-the-art with regards to n-gram-based scores while providing more relevant outputs.",4 "fast approximate bayesian computation for estimating parameters in differential equations. approximate bayesian computation (abc) using a sequential monte carlo method provides a comprehensive platform for parameter estimation, model selection and sensitivity analysis in differential equations. however, this method, like other monte carlo methods, incurs a significant computational cost as it requires explicit numerical integration of the differential equations to carry out inference. in this paper we propose a novel method for circumventing the requirement of explicit integration by using derivatives of gaussian processes to smooth the observations from which parameters are estimated. we evaluate our methods using synthetic data generated from model biological systems described by ordinary and delay differential equations. 
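The "drop the 99% smallest updates" idea from the sparse-communication abstract above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; keeping the dropped mass as a local residual (to be accumulated into later rounds) is an assumption based on common practice in gradient compression:

```python
import numpy as np

def sparsify(grad, drop_ratio=0.99):
    """Zero out the drop_ratio smallest-magnitude entries of a gradient.

    Returns the sparse update a worker would exchange and the residual
    it would typically keep locally for later rounds.
    """
    flat = np.abs(grad).ravel()
    k = int(len(flat) * drop_ratio)          # number of entries to drop
    # k-th smallest magnitude: everything at or below it is dropped.
    threshold = np.partition(flat, k - 1)[k - 1] if k > 0 else -np.inf
    mask = np.abs(grad) > threshold
    sparse_update = np.where(mask, grad, 0.0)
    residual = grad - sparse_update          # not transmitted
    return sparse_update, residual

s, r = sparsify(np.arange(1.0, 101.0))       # gradients 1..100, drop 99%
print(np.count_nonzero(s))                   # only the single largest survives
```

Because `sparse_update + residual` reconstructs the original gradient, nothing is lost locally; only the exchanged message is compressed.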
upon comparing the performance of our method to existing abc techniques, we demonstrate that it produces comparably reliable parameter estimates at a significantly reduced execution time.",19 "a formal measure of machine intelligence. a fundamental problem in artificial intelligence is that nobody really knows what intelligence is. the problem is especially acute when we need to consider artificial systems which are significantly different to humans. in this paper we approach this problem in the following way: we take a number of well known informal definitions of human intelligence that have been given by experts, and extract their essential features. these are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. we believe that this measure formally captures the concept of machine intelligence in the broadest reasonable sense.",4 "text2shape: generating shapes from natural language by learning joint embeddings. we present a method for generating colored 3d shapes from natural language. to this end, we first learn joint embeddings of freeform text descriptions and colored 3d shapes. our model combines and extends learning by association and metric learning approaches to learn implicit cross-modal connections, and produces a joint representation that captures the many-to-many relations between language and the physical properties of 3d shapes such as color and shape. to evaluate our approach, we collect a large dataset of natural language descriptions for physical 3d objects in the shapenet dataset. with the learned joint embedding we demonstrate text-to-shape retrieval that outperforms baseline approaches. using these embeddings with a novel conditional wasserstein gan framework, we generate colored 3d shapes from text. our method is the first to connect natural language text with realistic 3d objects exhibiting rich variations in color, texture, and shape detail. see video at https://youtu.be/zrapvrdl13q",4 "a practical method for solving contextual bandit problems using decision trees. many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. however, applying these algorithms in practice can be difficult because they require domain expertise to build appropriate features and to tune their parameters. we propose a new method for the contextual bandit problem that is simple, practical, and can be applied with little or no domain expertise. 
our algorithm relies on decision trees to model the context-reward relationship. decision trees are non-parametric, interpretable, and work well without hand-crafted features. to guide the exploration-exploitation trade-off, we use a bootstrapping approach which abstracts thompson sampling to non-bayesian settings. we also discuss several computational heuristics and demonstrate the performance of our method on several datasets.",4 "entropy analysis of word-length series of natural language texts: effects of text language and genre. we estimate the $n$-gram entropies of natural language texts in word-length representation and find that they are sensitive to text language and genre. we attribute this sensitivity to changes in the probability distribution of the lengths of single words and emphasize the crucial role of the uniformity of the probabilities of words with length between five and ten. furthermore, a comparison with the entropies of shuffled data reveals the impact of word length correlations on the estimated $n$-gram entropies.",4 "high resolution face completion with multiple controllable attributes via fully end-to-end progressive generative adversarial networks. we present a deep learning approach for high resolution face completion with multiple controllable attributes (e.g., male and smiling) under arbitrary masks. face completion entails understanding both structural meaningfulness and appearance consistency locally and globally to fill in ""holes"" whose content does not appear elsewhere in the input image. it is a challenging task, with the difficulty level increasing significantly with respect to high resolution, the complexity of the ""holes"" and the controllable attributes of the filled-in fragments. our system addresses these challenges by learning a fully end-to-end framework that trains generative adversarial networks (gans) progressively from low resolution to high resolution with conditional vectors encoding controllable attributes. we design novel network architectures to exploit information across multiple scales effectively and efficiently. we introduce new loss functions encouraging sharp completion. we show that our system can complete faces with large structural and appearance variations using a single feed-forward pass of computation, with a mean inference time of 0.007 seconds for images at 1024 x 1024 resolution. 
we also perform a pilot human study which shows that our approach outperforms state-of-the-art face completion methods in terms of rank analysis. the code will be released upon publication.",4 "adversarial feature learning. the ability of the generative adversarial networks (gans) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution. intuitively, models trained to predict these semantic latent representations given data may serve as useful feature representations for auxiliary problems where semantics are relevant. however, in their existing form, gans have no means of learning the inverse mapping -- projecting data back into the latent space. we propose bidirectional generative adversarial networks (bigans) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.",4 "lstm networks for data-aware remaining time prediction of business process instances. predicting the completion time of business process instances would be a very helpful aid when managing processes under service level agreement constraints. the ability to know in advance the trend of running process instances would allow business managers to react in time, in order to prevent delays or undesirable situations. however, making such accurate forecasts is not easy: many factors may influence the time required to complete a process instance. in this paper, we propose an approach based on deep recurrent neural networks (specifically lstms) which is able to exploit arbitrary information associated with single events, in order to produce an as-accurate-as-possible prediction of the completion time of running instances. experiments on real-world datasets confirm the quality of our proposal.",4 "causal rule sets for identifying subgroups with enhanced treatment effect. we introduce a novel generative model for interpretable subgroup analysis in causal inference applications, causal rule sets (crs). 
the crs model uses a small set of short rules to capture a subgroup in which the average treatment effect is elevated compared to the entire population. we present a bayesian framework for learning a causal rule set. the bayesian framework consists of a prior that favors simpler models and a bayesian logistic regression that characterizes the relation between outcomes, attributes and subgroup membership. we find maximum a posteriori models using discrete monte carlo steps in the joint solution space of rule sets and parameters. we provide theoretically grounded heuristics and bounding strategies to improve search efficiency. experiments show that the search algorithm can efficiently recover the true underlying subgroup and that crs shows consistently competitive performance compared to state-of-the-art baseline methods.",4 "neural paraphrase generation with stacked residual lstm networks. in this paper, we propose a novel neural approach for paraphrase generation. conventional paraphrase generation methods either leverage hand-written rules and thesauri-based alignments, or use statistical machine learning principles. to the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation. our primary contribution is a stacked residual lstm network, where we add residual connections between lstm layers. this allows for efficient training of deep lstms. we evaluate our model and other state-of-the-art deep learning models on three different datasets: ppdb, wikianswers and mscoco. evaluation results demonstrate that our model outperforms sequence to sequence, attention-based and bi-directional lstm models on bleu, meteor, ter and an embedding-based sentence similarity metric.",4 "optimal allocation strategies for the dark pool problem. we study the problem of allocating stocks to dark pools. we propose and analyze an optimal approach for allocations when continuous-valued allocations are allowed. we also propose a modification for the case when only integer-valued allocations are possible. we extend previous work on this problem to adversarial scenarios, while also improving on its results in the iid setup. the resulting algorithms are efficient, and perform well in simulations under stochastic and adversarial inputs.",19 "active user authentication for smartphones: a challenge data set and benchmark results. 
in this paper, automated user verification techniques for smartphones are investigated. a unique non-commercial dataset, the university of maryland active authentication dataset 02 (umdaa-02), for multi-modal user authentication research is introduced. this paper focuses on three sensors - front camera, touch sensor and location service - while providing a general description of the other modalities. benchmark results for face detection, face verification, touch-based user identification and location-based next-place prediction are presented, which indicate that robust methods fine-tuned to the mobile platform are needed to achieve satisfactory verification accuracy. the dataset will be made available to the research community for promoting additional research.",4 "a hybrid decision support system: application in healthcare. many systems based on knowledge, especially expert systems for medical decision support, have been developed. these systems are based on production rules and cannot learn or evolve except by updating them. in addition, taking into account several criteria induces an exorbitant number of rules to be injected into the system. it then becomes difficult to translate the medical knowledge supporting a decision into a simple rule. moreover, reasoning based on generic cases has become classic and may even reduce the range of possible solutions. to remedy that, we propose an approach based on multi-criteria decision making guided by a case-based reasoning (cbr) approach.",4 "facial expression detection using patch-based eigen-face isomap networks. automated facial expression detection poses two primary challenges, which include variations in expression and facial occlusions (glasses, beard, mustache or face covers). in this paper we introduce a novel automated patch creation technique that masks a particular region of interest in the face, followed by eigen-value decomposition of the patched faces and generation of isomaps to detect underlying clustering patterns among the faces. the proposed masked eigen-face based isomap clustering technique achieves 75% sensitivity and 66-73% accuracy in classification of faces with occlusions and smiling faces in around 1 second per image. 
also, betweenness centrality, eigen centrality and maximum information flow can be used as network-based measures to identify significant training faces for expression classification tasks. the proposed method can be used in combination with feature-based expression classification methods on large data sets for improving expression classification accuracies.",4 "matching-based selection with incomplete lists for decomposition multi-objective optimization. the balance between convergence and diversity is a key issue of evolutionary multi-objective optimization. the recently proposed stable matching-based selection provides a new perspective for handling this balance under the framework of decomposition multi-objective optimization. in particular, the stable matching between subproblems and solutions, which achieves an equilibrium between their mutual preferences, implicitly strikes a balance between convergence and diversity. nevertheless, the original stable matching model has a high risk of matching a solution with an unfavorable subproblem, which finally leads to an imbalanced selection result. in this paper, we propose an adaptive two-level stable matching-based selection for decomposition multi-objective optimization. specifically, borrowing the idea of stable matching with incomplete lists, we match each solution with one of its favorite subproblems by restricting the length of its preference list during the first-level stable matching. during the second-level stable matching, the remaining subproblems are thereafter matched with their favorite solutions according to the classic stable matching model. in particular, we develop an adaptive mechanism to automatically set the length of the preference list for each solution according to its local competitiveness. the performance of our proposed method is validated and compared with several state-of-the-art evolutionary multi-objective optimization algorithms on 62 benchmark problem instances. empirical results fully demonstrate the competitive performance of our proposed method on problems with complicated pareto sets and those with more than three objectives.",4 "existence and finiteness conditions for risk-sensitive planning: results and conjectures. decision-theoretic planning with risk-sensitive planning objectives is important for building autonomous agents and decision-support systems for real-world applications. 
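The stable matching with incomplete lists that the abstract above builds on is classical deferred acceptance where a proposer's preference list may be truncated, so some participants can end up unmatched. The sketch below is the textbook algorithm with subproblems as proposers, not the paper's actual selection operator (its preference definitions and adaptive list lengths are domain-specific):

```python
def stable_matching(sub_prefs, sol_prefs):
    """Deferred acceptance where subproblems 'propose' to solutions.

    sub_prefs[s] : possibly truncated preference list of subproblem s
    sol_prefs[x] : full preference list of solution x over subproblems
    A truncated list means a subproblem may stay unmatched, as in
    stable matching with incomplete lists.
    """
    rank = {x: {s: i for i, s in enumerate(p)} for x, p in sol_prefs.items()}
    next_prop = {s: 0 for s in sub_prefs}       # next index each s proposes to
    match = {}                                  # solution -> subproblem
    free = list(sub_prefs)
    while free:
        s = free.pop()
        if next_prop[s] >= len(sub_prefs[s]):
            continue                            # list exhausted: s stays unmatched
        x = sub_prefs[s][next_prop[s]]
        next_prop[s] += 1
        if x not in match:
            match[x] = s                        # x accepts tentatively
        elif rank[x][s] < rank[x][match[x]]:
            free.append(match[x])               # x prefers s: previous match freed
            match[x] = s
        else:
            free.append(s)                      # x rejects s; s proposes again later
    return match

pairing = stable_matching({'a': ['x', 'y'], 'b': ['x']},
                          {'x': ['b', 'a'], 'y': ['a', 'b']})
print(pairing)  # {'x': 'b', 'y': 'a'}: 'b' wins 'x', so 'a' falls back to 'y'
```

Restricting `sub_prefs[s]` to a short list is exactly the first-level trick described above: a solution can only be claimed by subproblems it actually favors.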
however, this line of research has been largely ignored in the artificial intelligence and operations research communities, since planning with risk-sensitive planning objectives is more complicated than planning with risk-neutral planning objectives. to remedy this situation, we derive conditions that guarantee that the optimal expected utilities of the total plan-execution reward exist and are finite for fully observable markov decision process models with non-linear utility functions. in the case of markov decision process models with both positive and negative rewards, our results hold for stationary policies only, but we conjecture that they can be generalized to non-stationary policies.",4 "belief revision: a critique. we examine carefully the rationale underlying the approaches to belief change taken in the literature, and highlight what we view as methodological problems. we argue that to study belief change carefully, we must be quite explicit about the ``ontology'' or scenario underlying the belief change process. this is something that has been missing in previous work, with its focus on postulates. our analysis shows that we must pay particular attention to two issues that have often been taken for granted: the first is how we model the agent's epistemic state. (do we use a set of beliefs, or a richer structure, such as an ordering on worlds? and if we use a set of beliefs, in what language are these beliefs expressed?) we show that even postulates that have been called ``beyond controversy'' are unreasonable when the agent's beliefs include beliefs about her own epistemic state as well as the external world. the second is the status of observations. (are observations known to be true, or just believed? in the latter case, how firm is the belief?) issues regarding the status of observations arise particularly when we consider iterated belief revision, where we must confront the possibility of revising by p and then by not-p.",4 "algebras of measurements: the logical structure of quantum mechanics. in quantum physics, a measurement is represented by a projection onto a closed subspace of a hilbert space. we study algebras of operators that abstract from the algebra of projections onto closed subspaces of a hilbert space. the properties of these operators are justified on epistemological grounds. commutation of measurements is a central topic of interest. classical logical systems may be viewed as measurement algebras in which all measurements commute. keywords: quantum measurements, measurement algebras, quantum logic. 
pacs: 02.10.-v.",18 "a study of topological descriptors for the analysis of 3d surface texture. methods from computational topology are becoming more popular in computer vision and have been shown to improve the state-of-the-art in several tasks. in this paper, we investigate the applicability of topological descriptors in the context of 3d surface analysis for the classification of different surface textures. we present a comprehensive study of topological descriptors, investigate their robustness and expressiveness, and compare them with state-of-the-art methods including convolutional neural networks (cnns). our results show that class-specific information is reflected well in topological descriptors. the investigated descriptors can directly compete with non-topological descriptors and capture complementary information. as a consequence they can improve the state-of-the-art when combined with non-topological descriptors.",4 "group factor analysis. factor analysis provides linear factors that describe relationships between the individual variables of a data set. we extend this classical formulation to linear factors that describe relationships between groups of variables, where each group represents either a set of related variables or a data set. the model also naturally extends canonical correlation analysis to more than two sets, in a way that is more flexible than previous extensions. our solution is formulated as variational inference of a latent variable model with structural sparsity, and it consists of two hierarchical levels: the higher level models the relationships between the groups, whereas the lower one models the observed variables given the higher level. we show that the resulting solution solves the group factor analysis problem accurately, outperforming alternative factor analysis based solutions as well as more straightforward implementations of group factor analysis. the method is demonstrated on two life science data sets, one on brain activation and the other on systems biology, illustrating its applicability to the analysis of different types of high-dimensional data sources.",19 "construction of non-convex polynomial loss functions for training a binary classifier with quantum annealing. quantum annealing is a heuristic quantum algorithm which exploits quantum resources to minimize an objective function embedded as the energy levels of a programmable physical system. 
to take advantage of a potential quantum advantage, one needs to be able to map the problem of interest to the native hardware with reasonably low overhead. because experimental considerations constrain our objective function to take the form of a low degree pubo (polynomial unconstrained binary optimization), we employ non-convex loss functions which are polynomial functions of the margin. we show that these loss functions are robust to label noise and provide a clear advantage over convex methods. these loss functions may also be useful for classical approaches as they compile to regularized risk expressions which can be evaluated in constant time with respect to the number of training examples.",4 "an alternating optimization method based on nonnegative matrix factorizations for deep neural networks. the backpropagation algorithm for calculating gradients has been widely used in the computation of the weights of deep neural networks (dnns). this method requires derivatives of the objective functions and has difficulties finding appropriate parameters such as the learning rate. in this paper, we propose a novel approach for computing the weight matrices of fully-connected dnns by using two types of semi-nonnegative matrix factorizations (semi-nmfs). in this method, the optimization processes are performed by calculating the weight matrices alternately, and backpropagation (bp) is not used. we also present a method to calculate a stacked autoencoder using nmf. the output results of the autoencoder are used as pre-training data for the dnns. the experimental results show that our method using three types of nmfs attains similar error rates to conventional dnns with bp.",4 "a logical n-and gate on a molecular turing machine. in boolean algebra, it is well known that the logical function corresponding to the negation of the conjunction --nand-- is universal, in the sense that any logical function can be built based on it. this property makes it essential for modern digital electronics and computer processor design. here, we design a molecular turing machine that computes the nand function of binary strings of arbitrary length. for this purpose, we perform a mathematical abstraction of the kind of operations done on a double-stranded dna molecule, as well as presenting a molecular encoding of the input symbols of the machine.",4 "stochastic pooling for regularization of deep convolutional neural networks. we introduce a simple and effective method for regularizing large convolutional neural networks. 
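The universality claim in the nand abstract above can be checked directly: negation, conjunction and disjunction all reduce to nand, so by induction any boolean function does. A minimal verification over 0/1 inputs:

```python
def nand(a, b):
    """NAND on 0/1 inputs: 0 only when both inputs are 1."""
    return 1 - (a & b)

def not_(a):
    return nand(a, a)                    # NOT x  ==  x NAND x

def and_(a, b):
    return nand(nand(a, b), nand(a, b))  # AND  ==  NOT(x NAND y)

def or_(a, b):
    return nand(nand(a, a), nand(b, b))  # OR   ==  (NOT x) NAND (NOT y), De Morgan

# Exhaustive check over all truth assignments.
for a in (0, 1):
    for b in (0, 1):
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
    assert not_(a) == 1 - a
```

With {NOT, AND, OR} recovered, every boolean circuit compiles to nand gates alone, which is why a machine computing nand over arbitrary-length strings is a meaningful universality target.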
we replace the conventional deterministic pooling operations with a stochastic procedure, randomly picking the activation within each pooling region according to a multinomial distribution given by the activities within the pooling region. the approach is hyper-parameter free and can be combined with other regularization approaches, such as dropout and data augmentation. we achieve state-of-the-art performance on four image datasets, relative to other approaches that do not utilize data augmentation.",4 "on local optima in learning bayesian networks. this paper proposes and evaluates the k-greedy equivalence search algorithm (kes) for learning bayesian networks (bns) from complete data. the main characteristic of kes is that it allows a trade-off between greediness and randomness, thus exploring different good local optima. when greediness is set at maximum, kes corresponds to the greedy equivalence search algorithm (ges). when greediness is kept at minimum, we prove that under mild assumptions kes asymptotically returns any inclusion optimal bn with nonzero probability. experimental results for both synthetic and real data are reported, showing that kes often finds a better local optimum than ges. moreover, we use kes to experimentally confirm that the number of different local optima is often huge.",4 "a modified mel filter bank to compute mfcc of subsampled speech. mel frequency cepstral coefficients (mfccs) are popularly used speech features for speech and speaker recognition applications. in this work, we propose a modified mel filter bank to extract mfccs from subsampled speech. we also propose a stronger metric which effectively captures the correlation between the mfccs of original speech and the mfccs of resampled speech. it is found that the proposed method of filter bank construction performs distinguishably well and gives recognition performance on resampled speech close to the recognition accuracies on original speech.",4 "an efficient fpga implementation of mri image filtering and tumor characterization using xilinx system generator. this paper presents an efficient architecture for various image filtering algorithms and tumor characterization using xilinx system generator (xsg). this architecture offers an alternative, through a graphical user interface that combines matlab, simulink and xsg, and explores important aspects concerned with hardware implementation. 
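The stochastic pooling abstract above replaces max/average pooling with sampling: each window's output is one of its activations, drawn with probability proportional to its value. A minimal NumPy sketch for non-overlapping windows (the all-zero-window convention here is an assumption; activations are taken to be non-negative, e.g. post-ReLU):

```python
import numpy as np

def stochastic_pool(acts, region=2, rng=None):
    """Stochastic pooling over non-overlapping region x region windows.

    Each window emits a single activation sampled with probability
    proportional to its (non-negative) value, instead of the max.
    """
    rng = rng or np.random.default_rng(0)
    h, w = acts.shape
    out = np.empty((h // region, w // region))
    for i in range(0, h, region):
        for j in range(0, w, region):
            window = acts[i:i + region, j:j + region].ravel()
            total = window.sum()
            if total == 0:
                out[i // region, j // region] = 0.0   # all-zero window
                continue
            p = window / total                        # multinomial weights
            out[i // region, j // region] = rng.choice(window, p=p)
    return out

a = np.array([[1.0, 3.0],
              [0.0, 0.0]])
print(stochastic_pool(a))   # one value from the window: 1.0 (prob 1/4) or 3.0 (prob 3/4)
```

Because a small activation still wins occasionally, the sampling acts like a regularizer during training; at test time the paper's counterpart is to weight each activation by its probability.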
the performance of this architecture, implemented on a spartan-3e starter kit (xc3s500e-fg320), exceeds that of architectures with similar or greater resources. the proposed architecture reduces the resources used on the target device by 50%.",4 "using neural networks to improve classical operating system fingerprinting techniques. we present remote operating system detection as an inference problem: given a set of observations (the target host responses to a set of tests), we want to infer the os type which most probably generated these observations. classical techniques used to perform this analysis present several limitations. to improve the analysis, we have developed tools using neural networks and statistics tools. we present two working modules: one which uses dce-rpc endpoints to distinguish windows versions, and another which uses nmap signatures to distinguish different versions of windows, linux, solaris, openbsd, freebsd and netbsd systems. we explain the details of the topology and the inner workings of the neural networks used, and the fine tuning of their parameters. finally we show positive experimental results.",4 "l1-regularized distributed optimization: a communication-efficient primal-dual framework. despite the importance of sparsity in many large-scale applications, there are few methods for distributed optimization of sparsity-inducing objectives. in this paper, we present a communication-efficient framework for l1-regularized optimization in the distributed environment. by viewing classical objectives in a more general primal-dual setting, we develop a new class of methods that can be efficiently distributed and applied to common sparsity-inducing models, such as the lasso, sparse logistic regression, and elastic net-regularized problems. we provide theoretical convergence guarantees for our framework, and demonstrate its efficiency and flexibility with a thorough experimental comparison on amazon ec2. our proposed framework yields speedups of up to 50x as compared to current state-of-the-art methods for distributed l1-regularized optimization.",4 "deep learning for medical image analysis. this report describes my research activities at the hasso plattner institute and summarizes my ph.d. plan and several novel, end-to-end trainable approaches for analyzing medical images using deep learning algorithms. 
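A building block behind most l1-regularized solvers of the kind surveyed in the abstract above is the soft-thresholding (proximal) operator, which shrinks coordinates toward zero and produces exact sparsity. The sketch below shows it inside a plain ISTA loop for the lasso; this is a generic single-machine illustration, not the paper's communication-efficient primal-dual method:

```python
import numpy as np

def soft_threshold(v, lam):
    """Prox of lam*||.||_1: shrink each coordinate toward zero by lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def ista(A, b, lam, steps=200):
    """Minimal ISTA sketch for the lasso: min 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        # Gradient step on the smooth part, then the l1 prox.
        x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
    return x
```

With `A` the identity the lasso solution is just soft-thresholding of `b`, e.g. `ista(np.eye(2), np.array([3.0, 0.5]), 1.0)` gives `[2.0, 0.0]`: the small coefficient is driven exactly to zero, which is the sparsity the abstract refers to.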
in this report, as an example, i explore different novel methods based on deep learning for brain abnormality detection, recognition, and segmentation. this report was prepared for the doctoral consortium of the aime-2017 conference.",4 "dynamic high resolution deformable articulated tracking. the last several years have seen significant progress in using depth cameras for tracking articulated objects such as human bodies, hands, and robotic manipulators. most approaches focus on tracking skeletal parameters of a fixed shape model, which makes them insufficient for applications that require accurate estimates of deformable object surfaces. to overcome this limitation, we present a 3d model-based tracking system for articulated deformable objects. our system is able to track human body pose and high resolution surface contours in real time using a commodity depth sensor and gpu hardware. we implement this as a joint optimization over a skeleton, to account for changes in pose, and over the vertices of a high resolution mesh, to track the subject's shape. experimental results show that we are able to capture dynamic sub-centimeter surface detail such as folds and wrinkles in clothing. we also show that this shape estimation aids kinematic pose estimation by providing a more accurate target to match against the point cloud. the end result is highly accurate spatiotemporal and semantic information which is well suited for physical human robot interaction as well as virtual and augmented reality systems.",4 "egocentric height estimation. egocentric, or first-person vision, which became popular in recent years with the emergence of wearable technology, is different from exocentric (third-person) vision in some distinguishable ways, one of which being that the camera wearer is generally not visible in the video frames. recent work has been done on action and object recognition in egocentric videos, as well as on biometric extraction from first-person videos. height estimation can be a useful feature for soft-biometrics and object tracking. here, we propose a method of estimating the height of an egocentric camera without any calibration or reference points. we used both traditional computer vision approaches and deep learning in order to determine which visual cues give the best height estimation results. 
here, we introduce a framework inspired by two-stream networks, comprising two convolutional neural networks, one based on spatial information and one based on the information given by the optical flow in a frame. given an egocentric video as input to the framework, our model yields a height estimate as output. we also incorporate late fusion to learn a combination of temporal and spatial cues. comparing our model with other methods that we used as baselines, we achieve height estimates for videos with a mean average error of 14.04 cm over a range of 103 cm of data, and a classification accuracy for relative height (tall, medium or short) of 93.75%, where the chance level is 33%.",4 "exploiting qualitative knowledge in the learning of conditional probabilities of bayesian networks. algorithms for learning the conditional probabilities of bayesian networks with hidden variables typically operate within a high-dimensional search space and yield only locally optimal solutions. one way of limiting the search space and avoiding local optima is to impose qualitative constraints that are based on background knowledge concerning the domain. we present a method for integrating formal statements of qualitative constraints into two learning algorithms, apn and em. in our experiments with synthetic data, this method yielded networks that satisfied the constraints almost perfectly. the accuracy of the learned networks was consistently superior to that of corresponding networks learned without constraints. the exploitation of qualitative constraints therefore appears to be a promising way to increase both the interpretability and the accuracy of learned bayesian networks with known structure.",4 "fca - an approach to the leach protocol of wireless sensor networks using fuzzy logic. in order to gather information efficiently, wireless sensor networks are partitioned into clusters. most of the proposed clustering algorithms do not consider the location of the base station. this situation causes hot spot problems in multi-hop wireless sensor networks. in this paper, we propose a fuzzy clustering algorithm (fca) which aims to prolong the lifetime of wireless sensor networks. fca adjusts the cluster-head radius considering the residual energy and the distance to the base station parameters of the sensor nodes. this helps decrease the intra-cluster work of the sensor nodes which are closer to the base station or have lower battery levels. 
we utilize fuzzy logic for handling the uncertainties in cluster-head radius estimation. we compare our algorithm with leach according to the first-node-dies, half-of-the-nodes-alive and energy-efficiency metrics. our simulation results show that fca performs better than the other algorithms in most cases. therefore, the proposed algorithm is a stable and energy-efficient clustering algorithm.",4 "combining multi-level contexts of superpixels using convolutional neural networks to perform natural scene labeling. modern deep learning algorithms have triggered various image segmentation approaches. however, most of them deal with pixel-based segmentation. superpixels, on the other hand, provide a certain degree of contextual information while reducing computation cost. in our approach, we have performed superpixel-level semantic segmentation considering 3 various levels of neighbours for semantic contexts. furthermore, we have enlisted a number of ensemble approaches like max-voting and weighted-average. we have also used the dempster-shafer theory of uncertainty to analyze confusion among various classes. our method has proved to be superior to a number of different modern approaches on the same dataset.",4 "implicit segmentation of kannada characters in offline handwriting recognition using hidden markov models. we describe a method for classification of handwritten kannada characters using hidden markov models (hmms). kannada script is agglutinative, where simple shapes are concatenated horizontally to form a character. this results in a large number of characters, making the task of classification difficult. character segmentation plays a significant role in reducing the number of classes. explicit segmentation techniques suffer when overlapping shapes are present, which is a common case with handwritten text. we use hmms to take advantage of the agglutinative nature of kannada script, which allows us to perform implicit segmentation of characters along with recognition. all experiments are performed on the chars74k dataset, which consists of 657 handwritten characters collected across multiple users. gradient-based features are extracted from individual characters and are used to train character hmms. the use of the implicit segmentation technique at the character level resulted in an improvement of around 10%. our system also outperformed an existing system tested on the same dataset by around 16%. 
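To make the fuzzy radius adjustment in the fca abstract above concrete, here is a toy Mamdani-style evaluation with triangular memberships. Everything specific in it - the membership shapes, the single and-rule, the radius bounds, and the input ranges - is a hypothetical illustration of how fuzzy inputs map to a cluster-head radius, not the paper's actual rule base:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def head_radius(energy, dist, r_min=5.0, r_max=25.0):
    """Toy fuzzy cluster-head radius (hypothetical memberships and rule).

    Higher residual energy (in [0, 1]) and larger distance to the base
    station (in meters) both argue for a larger radius, so that nodes
    near the base station or low on battery do less intra-cluster work.
    """
    strong = tri(energy, 0.3, 1.0, 1.7)    # degree the node has high energy
    far = tri(dist, 20.0, 100.0, 180.0)    # degree the node is far from the BS
    weight = min(strong, far)              # single and-rule, Mamdani min
    return r_min + (r_max - r_min) * weight
```

A full-energy node 100 m from the base station gets the maximum radius (25.0), while a depleted node next to the base station gets the minimum (5.0), matching the qualitative behaviour the abstract describes.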
analysis based on learning curves showed that increasing the training data could result in better accuracy. accordingly, we collected additional data and obtained an improvement of 4% with 6 additional samples.",4 "decision aids for adversarial planning in military operations: algorithms, tools, and turing-test-like experimental validation. the use of intelligent decision aids can help alleviate the challenges of planning complex operations. we describe integrated algorithms, and a tool capable of translating a high-level concept for a tactical military operation into a fully detailed, actionable plan, producing automatically (or with human guidance) plans with a realistic degree of detail and of human-like quality. tight interleaving of several algorithms -- planning, adversary estimates, scheduling, routing, attrition and consumption estimates -- comprises the computational approach of this tool. although originally developed for army large-unit operations, the technology is generic and also applies to a number of other domains, particularly critical situations requiring detailed planning within a constrained period of time. in this paper, we focus particularly on the engineering tradeoffs in the design of the tool. in an experimental evaluation, reminiscent of the turing test, the tool's performance compared favorably with that of human planners.",4 "resource aware design of a deep convolutional-recurrent neural network for speech recognition through audio-visual sensor fusion. today's automatic speech recognition systems rely only on acoustic signals and often do not perform well under noisy conditions. performing multi-modal speech recognition - processing acoustic speech signals and lip-reading video simultaneously - significantly enhances the performance of such systems, especially in noisy environments. this work presents the design of an audio-visual system for automated speech recognition, taking memory and computation requirements into account. first, a long-short-term-memory neural network for acoustic speech recognition is designed. second, convolutional neural networks are used to model lip-reading features. these are combined with an lstm network to model temporal dependencies and perform automatic lip-reading on video. 
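Evaluation in hmm-based recognizers like the kannada system above reduces to computing the likelihood of a feature sequence under each character model and picking the best; the workhorse is the forward algorithm. A minimal sketch with toy parameters (the matrices below are illustrative, not trained models):

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """P(obs | HMM) via the forward algorithm.

    pi : initial state distribution, shape (S,)
    A  : transition matrix, A[i, j] = P(state j | state i)
    B  : emission matrix, B[i, o] = P(symbol o | state i)
    obs: sequence of observed symbol indices
    """
    alpha = pi * B[:, obs[0]]                 # forward variables at t = 0
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]         # propagate, then weight by emission
    return alpha.sum()

# Toy 2-state HMM that deterministically alternates states and
# emits its state index, so only alternating sequences are possible.
pi = np.array([1.0, 0.0])
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
B = np.eye(2)
print(forward_likelihood(pi, A, B, [0, 1, 0]))  # 1.0: the only possible path
```

Classification then scores the same observation sequence against each character's hmm and returns the argmax; implicit segmentation falls out of concatenating sub-models, though that composition step is beyond this sketch.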
finally, the acoustic-speech and visual lip-reading networks are combined to process acoustic and visual features simultaneously. an attention mechanism ensures the performance of the model in noisy environments. the system is evaluated on the tcd-timit 'lipspeaker' dataset for audio-visual phoneme recognition with clean audio and with additive white noise at an snr of 0db. it achieves 75.70% and 58.55% phoneme accuracy respectively, over 14 percentage points better than the state-of-the-art at all noise levels.",4 "verbal chunk extraction in french using limited resources. a way of extracting french verbal chunks, inflected and infinitive, is explored and tested on an effective corpus. declarative morphological and local grammar rules specifying chunks and simple contextual structures are used, relying on limited lexical information and simple heuristic/statistic properties obtained from restricted corpora. the specific goals, the architecture and formalism of the system, the linguistic information it relies on and the results obtained on an effective corpus are presented.",4 "a kernelized deep convolutional neural network for describing complex images. with their impressive capability to capture visual content, deep convolutional neural networks (cnns) have demonstrated promising performance in various vision-based applications, such as classification, recognition, and object detection. however, due to the intrinsic structure design of cnns, for images with complex content they achieve limited capability of invariance to translation, rotation, and re-sizing changes, which is strongly emphasized in the scenario of content-based image retrieval. in this paper, to address this problem, we propose a new kernelized deep convolutional neural network. we first discuss our motivation with an experimental study to demonstrate the sensitivity of the global cnn feature to basic geometric transformations. then, we propose to represent the visual content with approximate invariance to these geometric transformations from a kernelized perspective. we extract cnn features on the detected object-like patches and aggregate these patch-level cnn features to form a vectorial representation with the fisher vector model. 
effectiveness proposed algorithm demonstrated image search application three benchmark datasets.",4 "280 birds one stone: inducing multilingual taxonomies wikipedia using character-level classification. propose simple, yet effective, approach towards inducing multilingual taxonomies wikipedia. given english taxonomy, approach leverages interlanguage links wikipedia followed character-level classifiers induce high-precision, high-coverage taxonomies languages. experiments, demonstrate approach significantly outperforms state-of-the-art, heuristics-heavy approaches six languages. consequence work, release presumably largest accurate multilingual taxonomic resource spanning 280 languages.",4 "overcoming vanishing gradient problem plain recurrent networks. plain recurrent networks greatly suffer vanishing gradient problem gated neural networks (gnns) long-short term memory (lstm) gated recurrent unit (gru) deliver promising results many sequence learning tasks sophisticated network designs. paper shows address problem plain recurrent network analyzing gating mechanisms gnns. propose novel network called recurrent identity network (rin) allows plain recurrent network overcome vanishing gradient problem training deep models without use gates. compare model irnns lstms multiple sequence modeling benchmarks. rins demonstrate competitive performance converge faster tasks. notably, small rin models produce 12%--67% higher accuracy sequential permuted mnist datasets reach state-of-the-art performance babi question answering dataset.",4 "wordfence: text detection natural images border awareness. recent years, text recognition achieved remarkable success recognizing scanned document text. however, word recognition natural images still open problem, generally requires time consuming post-processing steps. present novel architecture individual word detection scene images based semantic segmentation. 
contributions twofold: concept wordfence, detects border areas surrounding individual word novel pixelwise weighted softmax loss function penalizes background emphasizes small text regions. wordfence ensures word detected individually, new loss function provides strong training signal text word border localization. proposed technique avoids intensive post-processing, producing end-to-end word detection system. achieve superior localization recall common benchmark datasets - 92% recall icdar11 icdar13 63% recall svt. furthermore, end-to-end word recognition system achieves state-of-the-art 86% f-score icdar13.",4 "comprehensive implementation conceptual spaces. highly influential framework conceptual spaces provides geometric way representing knowledge. instances represented points concepts represented regions (potentially) high-dimensional space. based recent formalization, present comprehensive implementation conceptual spaces framework capable representing concepts inter-domain correlations, also offers variety operations concepts.",4 "dynamic stochastic approximation multi-stage stochastic optimization. paper, consider multi-stage stochastic optimization problems convex objectives conic constraints stage. present new stochastic first-order method, namely dynamic stochastic approximation (dsa) algorithm, solving types stochastic optimization problems. show dsa achieve optimal ${\cal o}(1/\epsilon^4)$ rate convergence terms total number required scenarios applied three-stage stochastic optimization problem. show rate convergence improved ${\cal o}(1/\epsilon^2)$ objective function strongly convex. also discuss variants dsa solving general multi-stage stochastic optimization problems number stages $t > 3$. developed dsa algorithms need go scenario tree order compute $\epsilon$-solution multi-stage stochastic optimization problem. 
best knowledge, first time stochastic approximation type methods generalized multi-stage stochastic optimization $t \ge 3$.",12 "annotating object instances polygon-rnn. propose approach semi-automatic annotation object instances. current methods treat object segmentation pixel-labeling problem, cast polygon prediction task, mimicking current datasets annotated. particular, approach takes input image crop sequentially produces vertices polygon outlining object. allows human annotator interfere time correct vertex needed, producing accurate segmentation desired annotator. show approach speeds annotation process factor 4.7 across classes cityscapes, achieving 78.4% agreement iou original ground-truth, matching typical agreement human annotators. cars, speed-up factor 7.3 agreement 82.2%. show generalization capabilities approach unseen datasets.",4 "quantized convolutional neural networks mobile devices. recently, convolutional neural networks (cnn) demonstrated impressive performance various computer vision tasks. however, high performance hardware typically indispensable application cnn models due high computation complexity, prohibits extensions. paper, propose efficient framework, namely quantized cnn, simultaneously speed-up computation reduce storage memory overhead cnn models. filter kernels convolutional layers weighting matrices fully-connected layers quantized, aiming minimizing estimation error layer's response. extensive experiments ilsvrc-12 benchmark demonstrate 4~6x speed-up 15~20x compression merely one percentage loss classification accuracy. quantized cnn model, even mobile devices accurately classify images within one second.",4 "found good match: keep searching? - accuracy performance iris matching using 1-to-first search. iris recognition used many applications around world, enrollment sizes large one billion persons india's aadhaar program. large enrollment sizes require special optimizations order achieve fast database searches. 
one optimization used operational scenarios 1:first search. approach, instead scanning entire database, search terminated first sufficiently good match found. saves time, ignores potentially better matches may exist unexamined portion enrollments. least one prominent successful border-crossing program used approach nearly decade, order allow users fast ""token-free"" search. work investigates search accuracy 1:first compares traditional 1:n search. several different scenarios considered trying emulate real environments best possible: range enrollment sizes, closed- open-set configurations, two iris matchers, different permutations galleries. results confirm expected accuracy degradation using 1:first search, also allow us identify acceptable working parameters significant search time reduction achieved, maintaining accuracy similar 1:n search.",4 "top-down saliency detection driven visual classification. paper presents approach top-down saliency detection guided visual classification tasks. first learn compute visual saliency specific visual task accomplished, opposed state-of-the-art methods assess saliency merely bottom-up principles. afterwards, investigate extent visual saliency support visual classification nontrivial cases. achieve this, propose salclassnet, cnn framework consisting two networks jointly trained: a) first one computing top-down saliency maps input images, b) second one exploiting computed saliency maps visual classification. test approach, collected dataset eye-gaze maps, using tobii t60 eye tracker, asking several subjects look images stanford dogs dataset, objective distinguishing dog breeds. performance analysis dataset saliency benchmarking datasets, poet, showed salclassnet outperforms state-of-the-art saliency detectors, salnet salicon. finally, analyzed performance salclassnet fine-grained recognition task found generalizes better existing visual classifiers.
achieved results, thus, demonstrate 1) conditioning saliency detectors object classes reaches state-of-the-art performance, 2) providing explicitly top-down saliency maps visual classifiers enhances classification accuracy.",4 "using distributional semantic vector space knowledge base reasoning uncertain conditions. inherent inflexibility incompleteness commonsense knowledge bases (kb) limited usefulness. describe system called displacer performing kb queries extended analogical capabilities word2vec distributional semantic vector space (dsvs). allows system answer queries information contained original kb form. performing analogous queries semantically related terms mapping answers back context original query using displacement vectors, able give approximate answers many questions which, posed kb alone, would return results. also show hand-curated knowledge kb used increase accuracy dsvs solving analogy problems. ways, kb dsvs make other's weaknesses.",4 "multi-issue negotiation deadlines. paper studies bilateral multi-issue negotiation self-interested autonomous agents. now, number different procedures used process; three main ones package deal procedure issues bundled discussed together, simultaneous procedure issues discussed simultaneously independently other, sequential procedure issues discussed one another. since yields different outcome, key problem decide one use circumstances. specifically, consider question model agents time constraints (in form deadlines discount factors) information uncertainty (in agents know opponents utility function). model, consider issues independent interdependent determine equilibria case procedure. doing, show package deal fact optimal procedure party. 
go show that, although package deal may computationally complex two procedures, generates pareto optimal outcomes (unlike two), similar earliest latest possible times agreement simultaneous procedure (which better sequential procedure), (like two procedures) generates unique outcome certain conditions (which define).",4 "rationally biased learning. human perception decision biases grounded form rationality? return camp hunting gathering. see grass moving. know probability snake grass. cross grass - risk bitten snake - make long, hence costly, detour? based storyline, consider rational decision maker maximizing expected discounted utility learning. show optimal behavior displays three biases: status quo, salience, overestimation small probabilities. biases product rational behavior.",4 coercive region-level registration multi-modal images. propose coercive approach simultaneously register segment multi-modal images share similar spatial structure. registration done region level facilitate data fusion avoiding need interpolation. algorithm performs alternating minimization objective function informed statistical models pixel values different modalities. hypothesis tests developed determine whether refine segmentations splitting regions. demonstrate approach significantly better performance state-of-the-art registration segmentation methods microscopy images.,4 "large scale language modeling automatic speech recognition. large language models proven quite beneficial variety automatic speech recognition tasks google. summarize results voice search youtube speech transcription tasks highlight impact one expect increasing amount training data, size language model estimated data. depending task, availability amount training data used, language model size amount work care put integrating lattice rescoring step observe reductions word error rate 6% 10% relative, systems wide range operating points 17% 52% word error rate.",4 "fast convnets using group-wise brain damage. 
revisit idea brain damage, i.e. pruning coefficients neural network, suggest brain damage modified used speedup convolutional layers. approach uses fact many efficient implementations reduce generalized convolutions matrix multiplications. suggested brain damage process prunes convolutional kernel tensor group-wise fashion adding group-sparsity regularization standard training process. group-wise pruning, convolutions reduced multiplications thinned dense matrices, leads speedup. comparison alexnet, method achieves competitive performance.",4 "spatial features multi-font/multi-size kannada numerals vowels recognition. paper presents multi-font/multi-size kannada numerals vowels recognition based spatial features. directional spatial features viz stroke density, stroke length number strokes image employed potential features characterize printed kannada numerals vowels. based features 1100 numerals 1400 vowels classified multi-class support vector machines (svm). proposed system achieves recognition accuracy 98.45% 90.64% numerals vowels respectively.",4 "quantized memory-augmented neural networks. memory-augmented neural networks (manns) refer class neural network models equipped external memory (such neural turing machines memory networks). neural networks outperform conventional recurrent neural networks (rnns) terms learning long-term dependency, allowing solve intriguing ai tasks would otherwise hard address. paper concerns problem quantizing manns. quantization known effective deploy deep models embedded systems limited resources. furthermore, quantization substantially reduce energy consumption inference procedure. benefits justify recent developments quantized multi layer perceptrons, convolutional networks, rnns. however, prior work reported successful quantization manns. in-depth analysis presented reveals various challenges appear quantization networks.
without addressing properly, quantized manns would normally suffer excessive quantization error leads degraded performance. paper, identify memory addressing (specifically, content-based addressing) main reason performance degradation propose robust quantization method manns address challenge. experiments, achieved computation-energy gain 22x 8-bit fixed-point binary quantization compared floating-point implementation. measured babi dataset, resulting model, named quantized mann (q-mann), improved error rate 46% 30% 8-bit fixed-point binary quantization, respectively, compared mann quantized using conventional techniques.",4 "fusion hyperspectral panchromatic images using spectral unmixing results. hyperspectral imaging, due providing high spectral resolution images, one important tools remote sensing field. technological restrictions hyperspectral sensors limited spatial resolution. hand panchromatic image better spatial resolution. combining information together provide better understanding target scene. spectral unmixing mixed pixels hyperspectral images results spectral signature abundance fractions endmembers gives information location mixed pixel. paper used spectral unmixing results hyperspectral images segmentation results panchromatic image data fusion. proposed method applied simulated data using aviris indian pines datasets. results show method effectively combine information hyperspectral panchromatic images.",4 "survey credit card fraud detection techniques: data technique oriented perspective. credit card plays important role today's economy. becomes unavoidable part household, business global activities. although using credit cards provides enormous benefits used carefully responsibly, significant credit financial damages may caused fraudulent activities. many techniques proposed confront growth credit card fraud. however, techniques goal avoiding credit card fraud; one drawbacks, advantages characteristics.
paper, investigating difficulties credit card fraud detection, seek review state art credit card fraud detection techniques, data sets evaluation criteria. the advantages disadvantages fraud detection methods enumerated compared. furthermore, classification mentioned techniques two main fraud detection approaches, namely, misuse (supervised) anomaly detection (unsupervised) presented. again, classification techniques proposed based capability process numerical categorical data sets. different data sets used literature described grouped real synthesized data effective common attributes extracted usage. moreover, evaluation employed criteria literature collected discussed. consequently, open issues credit card fraud detection explained guidelines new researchers.",4 "answer sequence learning neural networks answer selection community question answering. paper, answer selection problem community question answering (cqa) regarded answer sequence labeling task, novel approach proposed based recurrent architecture problem. approach applies convolution neural networks (cnns) learning joint representation question-answer pair firstly, uses joint representation input long short-term memory (lstm) learn answer sequence question labeling matching quality answer. experiments conducted semeval 2015 cqa dataset shows effectiveness approach.",4 "role word length semantic topology. topological argument presented concerning structure semantic space, based negative correlation polysemy word length. resulting graph structure applied modeling free-recall experiments, resulting predictions comparative values recall probabilities. associative recall found favor longer words whereas sequential recall found favor shorter words. data peers experiments lohnas et al. (2015) healey kahana (2016) confirm predictions, correlation coefficients $r_{seq}= -0.17$ $r_{ass}= +0.17$.
argument applied predicting global properties list recall, leads novel explanation word-length effect based optimization retrieval strategies.",16 "learning-based image reconstruction via parallel proximal algorithm. past decade, sparsity-driven regularization led advancement image reconstruction algorithms. traditionally, regularizers rely analytical models sparsity (e.g. total variation (tv)). however, recent methods increasingly centered around data-driven arguments inspired deep learning. letter, propose generalize tv regularization replacing l1-penalty alternative prior trainable. specifically, method learns prior via extending recently proposed fast parallel proximal algorithm (fppa) incorporate data-adaptive proximal operators. proposed framework require additional inner iterations evaluating proximal mappings corresponding learned prior. moreover, formalism ensures training reconstruction processes share algorithmic structure, making end-to-end implementation intuitive. example, demonstrate algorithm problem deconvolution fluorescence microscope.",4 "survey calibration methods optical see-through head-mounted displays. optical see-through head-mounted displays (ost hmds) major output medium augmented reality, seen significant growth popularity usage among general public due growing release consumer-oriented models, microsoft hololens. unlike virtual reality headsets, ost hmds inherently support addition computer-generated graphics directly light path user's eyes view physical world. augmented virtual reality systems, physical position ost hmd typically determined external embedded 6-degree-of-freedom tracking system. however, order properly render virtual objects, perceived spatially aligned physical environment, also necessary accurately measure position user's eyes within tracking system's coordinate frame. 20 years, researchers proposed various calibration methods determine needed eye position. 
however, date, comprehensive overview procedures requirements. hence, paper surveys field calibration methods ost hmds. specifically, provides insights fundamentals calibration techniques, presents overview manual automatic approaches, well evaluation methods metrics. finally, also identifies opportunities future research.",4 "integrating human-provided information belief state representation using dynamic factorization. partially observed environments, useful human provide robot declarative information augments direct sensory observations. instance, given robot search-and-rescue mission, human operator might suggest locations interest. provide representation robot's internal knowledge supports efficient combination raw sensory information high-level declarative information presented formal language. computational efficiency achieved dynamically selecting appropriate factoring belief state, combining aspects belief correlated information separating not. strategy works open domains, set possible objects known advance, provides significant improvements inference time, leading efficient planning complex partially observable tasks. validate approach experimentally two open-domain planning problems: 2d discrete gridworld task 3d continuous cooking task.",4 "decision-making support system based know-how. research results described concerned with: - developing domain modeling method tools provide design implementation decision-making support systems computer integrated manufacturing; - building decision-making support system based know-how software environment. research funded nedo, japan.",4 computational geometry column 38. recent results curve reconstruction described.,4 "denet: scalable real-time object detection directed sparse sampling. define object detection imagery problem estimating large extremely sparse bounding box dependent probability distribution.
subsequently identify sparse distribution estimation scheme, directed sparse sampling, employ single end-to-end cnn based detection model. methodology extends formalizes previous state-of-the-art detection models additional emphasis high evaluation rates reduced manual engineering. introduce two novelties, corner based region-of-interest estimator deconvolution based cnn model. resulting model scene adaptive, require manually defined reference bounding boxes produces highly competitive results mscoco, pascal voc 2007 pascal voc 2012 real-time evaluation rates. analysis suggests model performs particularly well fine-grained object localization desirable. argue advantage stems significantly larger set available regions-of-interest relative methods. source-code available from: https://github.com/lachlants/denet",4 "minimax optimal algorithms unconstrained linear optimization. design analyze minimax-optimal algorithms online linear optimization games player's choice unconstrained. player strives minimize regret, difference loss loss post-hoc benchmark strategy. standard benchmark loss best strategy chosen bounded comparator set. comparison set adversary's gradients satisfy l_infinity bounds, give value game closed form prove approaches sqrt(2t/pi) -> infinity. interesting algorithms result consider soft constraints comparator, rather restricting bounded set. warmup, analyze game quadratic penalty. value game exactly t/2, value achieved perhaps simplest online algorithm all: unprojected gradient descent constant learning rate. derive minimax-optimal algorithm much softer penalty function. algorithm achieves good bounds standard notion regret comparator point, without needing specify comparator set advance. value game converges sqrt{e} -> infinity; give closed-form exact value function t. resulting algorithm natural unconstrained investment betting scenarios, since guarantees worst constant loss, allowing exponential reward ""easy"" adversary.",4 "rotational unit memory.
concepts unitary evolution matrices associative memory boosted field recurrent neural networks (rnn) state-of-the-art performance variety sequential tasks. however, rnn still limited capacity manipulate long-term memory. bypass weakness successful applications rnn use external techniques attention mechanisms. paper propose novel rnn model unifies state-of-the-art approaches: rotational unit memory (rum). core rum rotational operation, is, naturally, unitary matrix, providing architectures power learn long-term dependencies overcoming vanishing exploding gradients problem. moreover, rotational unit also serves associative memory. evaluate model synthetic memorization, question answering language modeling tasks. rum learns copying memory task completely improves state-of-the-art result recall task. rum's performance babi question answering task comparable models attention mechanism. also improve state-of-the-art result 1.189 bits-per-character (bpc) loss character level penn treebank (ptb) task, signify applications rum real-world sequential data. universality construction, core rnn, establishes rum promising approach language modeling, speech recognition machine translation.",4 "deep convolutional neural networks predominant instrument recognition polyphonic music. identifying musical instruments polyphonic music recordings challenging important problem field music information retrieval. enables music search instrument, helps recognize musical genres, make music transcription easier accurate. paper, present convolutional neural network framework predominant instrument recognition real-world polyphonic music. train network fixed-length music excerpts single-labeled predominant instrument estimate arbitrary number predominant instruments audio signal variable length. obtain audio-excerpt-wise result, aggregate multiple outputs sliding windows test audio. 
so, investigated two different aggregation methods: one takes average instrument takes instrument-wise sum followed normalization. addition, conducted extensive experiments several important factors affect performance, including analysis window size, identification threshold, activation functions neural networks find optimal set parameters. using dataset 10k audio excerpts 11 instruments evaluation, found convolutional neural networks robust conventional methods exploit spectral features source separation support vector machines. experimental results showed proposed convolutional network architecture obtained f1 measure 0.602 micro 0.503 macro, respectively, achieving 19.6% 16.4% performance improvement compared state-of-the-art algorithms.",4 "topic modeling short texts incorporating word embeddings. inferring topics overwhelming amount short texts becomes critical challenging task many content analysis tasks, content characterizing, user interest profiling, emerging topic detecting. existing methods probabilistic latent semantic analysis (plsa) latent dirichlet allocation (lda) cannot solve problem well since limited word co-occurrence information available short texts. paper studies incorporate external word correlation knowledge short texts improve coherence topic modeling. based recent results word embeddings learn semantically representations words large corpus, introduce novel method, embedding-based topic model (etm), learn latent topics short texts. etm solves problem limited word co-occurrence information aggregating short texts long pseudo-texts, also utilizes markov random field regularized model gives correlated words better chance put topic. experiments real-world datasets validate effectiveness model comparing state-of-the-art models.",4 "modelling probability density markov sources. paper introduces objective function seeks minimise average total number bits required encode joint state layers markov source.
type encoder may applied problem optimising bottom-up (recognition model) top-down (generative model) connections multilayer neural network, unifies several previous results optimisation multilayer neural networks.",4 "feature selection parallel technique remotely sensed imagery classification. remote sensing research focusing feature selection long attracted attention remote sensing community feature selection prerequisite image processing various applications. different feature selection methods proposed improve classification accuracy. vary basic search techniques clonal selections, various optimal criteria investigated. recently, methods using dependence-based measures attracted much attention due ability deal high dimensional datasets. however, methods based cramer's v test, performance issues large datasets. paper, propose parallel approach improve performance. evaluate approach hyper-spectral high spatial resolution images compare proposed methods centralized version preliminary results. results promising.",4 "minimizing inter-subject variability fnirs based brain computer interfaces via multiple-kernel support vector learning. brain signal variability measurements obtained different subjects different sessions significantly deteriorates accuracy brain-computer interface (bci) systems. moreover variabilities, also known inter-subject inter-session variabilities, require lengthy calibration sessions bci system used. furthermore, calibration session repeated subject independently use bci due inter-session variability. study, present algorithm order minimize above-mentioned variabilities overcome time-consuming usually error-prone calibration time. algorithm based linear programming support-vector machines extensions multiple kernel learning framework. tackle inter-subject -session variability feature spaces classifiers. done incorporating subject- session-specific feature spaces much richer feature spaces set optimal decision boundaries.
decision boundary represents subject- session specific spatio-temporal variabilities neural signals. consequently, single classifier multiple feature spaces generalize well new unseen test patterns even without calibration steps. demonstrate classifiers maintain good performances even presence large degree bci variability. present study analyzes bci variability related oxy-hemoglobin neural signals measured using functional near-infrared spectroscopy.",19 "sampling optimization space measures: langevin dynamics composite optimization problem. study sampling optimization space measures. focus gradient flow-based optimization langevin dynamics case study. investigate source bias unadjusted langevin algorithm (ula) discrete time, consider remove reduce bias. point difficulty heat flow exactly solvable, neither forward backward method implementable general, except gaussian data. propose symmetrized langevin algorithm (sla), smaller bias ula, price implementing proximal gradient step space. show sla fact consistent gaussian target measure, whereas ula not. also illustrate various algorithms explicitly gaussian target measure, including gradient descent, proximal gradient, forward-backward, show consistent.",12 "real-time 3d shape micro-details. motivated growing demand interactive environments, propose accurate real-time 3d shape reconstruction technique. provide reliable 3d reconstruction still challenging task dealing real-world applications, integrate several components including (i) photometric stereo (ps), (ii) perspective cook-torrance reflectance model enables ps deal broad range possible real-world object reflections, (iii) realistic lighting situation, (iv) recurrent optimization network (ron) finally (v) heuristic dijkstra gaussian mean curvature (dgmc) initialization approach. demonstrate potential benefits hybrid model providing 3d shape highly-detailed information micro-prints first time.
real-world images taken mobile phone camera simple setup consumer-level equipment. addition, complementary synthetic experiments confirm beneficial properties novel method superiority state-of-the-art approaches.",4 "overdispersed black-box variational inference. introduce overdispersed black-box variational inference, method reduce variance monte carlo estimator gradient black-box variational inference. instead taking samples variational distribution, use importance sampling take samples overdispersed distribution exponential family variational approximation. approach general since readily applied exponential family distribution, typical choice variational approximation. run experiments two non-conjugate probabilistic models show method effectively reduces variance, overhead introduced computation proposal parameters importance weights negligible. find overdispersed importance sampling scheme provides lower variance black-box variational inference, even latter uses twice number samples. results faster convergence black-box inference procedure.",19 "capturing localized image artifacts cnn-based hyper-image representation. training deep cnns capture localized image artifacts relatively small dataset challenging task. enough images hand, one hope deep cnn characterizes localized artifacts entire data effect output. however, smaller datasets, deep cnns may overfit shallow ones find hard capture local artifacts. thus image-based small-data applications first train framework collection patches (instead entire image) better learn representation localized artifacts. output obtained averaging patch-level results. approach ignores spatial correlation among patches various patch locations affect output. also fails cases patches mainly contribute image label. combat scenarios, develop notion hyper-image representations. cnn two stages. first stage trained patches. second stage utilizes last layer representation developed first stage form hyper-image, used train second stage. 
show approach able develop better mapping image output. analyze additional properties approach show effectiveness one synthetic two real-world vision tasks - no-reference image quality estimation image tampering detection - performance improvement existing strong baselines.",4 "approximate principal direction trees. introduce new spatial data structure high dimensional data called \emph{approximate principal direction tree} (apd tree) adapts intrinsic dimension data. algorithm ensures vector-quantization accuracy similar computationally-expensive pca trees similar time-complexity lower-accuracy rp trees. apd trees use small number power-method iterations find splitting planes recursively partitioning data. provide natural trade-off running-time accuracy achieved rp pca trees. theoretical results establish a) strong performance guarantees regardless convergence rate power-method b) $o(\log d)$ iterations suffice establish guarantee pca trees intrinsic dimension $d$. demonstrate trade-off efficacy data structure cpu gpu.",4 "geometrical interpretation shannon's entropy based born rule. paper analyze discrete probability distributions probabilities particular outcomes experiment (microstates) represented ratio natural numbers (in words, probabilities represented digital numbers finite representation length). introduce several results based recently proposed joystick probability selector, represents geometrical interpretation probability based born rule. terms generic space generic dimension discrete distribution, well as, effective dimension going introduced. shown simple geometric representation lead optimal code length coding sequence signals. then, give new, geometrical, interpretation shannon entropy discrete distribution. suggest shannon entropy represents logarithm effective dimension distribution. proposed geometrical interpretation shannon entropy used prove information inequalities elementary way.",4 "learning games rademacher observations losses.
recently shown supervised learning popular logistic loss equivalent optimizing exponential loss sufficient statistics class: rademacher observations (rados). first show unexpected equivalence actually generalized example / rado losses, necessary sufficient conditions equivalence, exemplified four losses bear popular names various fields: exponential (boosting), mean-variance (finance), linear hinge (on-line learning), relu (deep learning), unhinged (statistics). second, show generalization unveils surprising new connection regularized learning, particular sufficient condition regularizing loss examples equivalent regularizing rados (with minkowski sums) equivalent rado loss. brings simple powerful rado-based learning algorithms sparsity-controlling regularization, exemplify boosting algorithm regularized exponential rado-loss, formally boosts four types regularization, including popular ridge lasso, recently coined slope --- obtain first proven boosting algorithm last regularization. first contribution equivalence rado example-based losses, omega-r.adaboost appears efficient proxy boost regularized logistic loss examples using whichever four regularizers. experiments display regularization consistently improves performances rado-based learning, may challenge beat state art example-based learning even learning small sets rados. finally, connect regularization differential privacy, display tiny budgets afforded big domains beating (protected) example-based learning.",4 "approximate bayesian long short-term memory algorithm outlier detection. long short-term memory networks trained gradient descent back-propagation received great success various applications. however, point estimation weights networks prone over-fitting problems lacks important uncertainty information associated estimation. however, exact bayesian neural network methods intractable non-applicable real-world applications.
study, propose approximate estimation weights uncertainty using ensemble kalman filter, easily scalable large number weights. furthermore, optimize covariance noise distribution ensemble update step using maximum likelihood estimation. assess proposed algorithm, apply outlier detection five real-world events retrieved twitter platform.",4 "analysis first prototype universal intelligence tests: evaluating comparing ai algorithms humans. today, available methods assess ai systems focused using empirical techniques measure performance algorithms specific tasks (e.g., playing chess, solving mazes land helicopter). however, methods appropriate want evaluate general intelligence ai and, even less, compare human intelligence. anynt project designed new method evaluation tries assess ai systems using well known computational notions problems general possible. new method serves assess general intelligence (which allows us learn solve new kind problem face) evaluate performance set specific tasks. method focuses measuring intelligence algorithms, also assess intelligent system (human beings, animals, ai, aliens?,...), letting us place results scale and, therefore, able compare them. new approach allow us (in future) evaluate compare kind intelligent system known even build/find, artificial biological. master thesis aims ensuring new method provides consistent results evaluating ai algorithms, done design implementation prototypes universal intelligence tests application different intelligent systems (ai algorithms human beings). study analyze whether results obtained two different intelligent systems properly located scale propose changes refinements prototypes order to, future, able achieve truly universal intelligence test.",4 "sentiment analysis financial news headlines using training dataset augmentation. paper discusses approach taken uwaterloo team arrive solution fine-grained sentiment analysis problem posed task 5 semeval 2017.
paper describes document vectorization sentiment score prediction techniques used, well design implementation decisions taken building system task. system uses text vectorization models, n-gram, tf-idf paragraph embeddings, coupled regression model variants predict sentiment scores. amongst methods examined, unigrams bigrams coupled simple linear regression obtained best baseline accuracy. paper also explores data augmentation methods supplement training dataset. system designed subtask 2 (news statements headlines).",4 "deep learning good steganalysis tool embedding key reused different images, even cover source-mismatch. since boss competition, 2010, steganalysis approaches use learning methodology involving two steps: feature extraction, rich models (rm), image representation, use ensemble classifier (ec) learning step. 2015, qian et al. shown use deep learning approach jointly learns computes features, promising steganalysis. paper, follow-up study qian et al., show that, due intrinsic joint minimization, results obtained convolutional neural network (cnn) fully connected neural network (fnn), well parameterized, surpass conventional use rm ec. first, numerous experiments conducted order find best "" shape "" cnn. second, experiments carried clairvoyant scenario order compare cnn fnn rm ec. results show 16% reduction classification error cnn fnn. third, experiments also performed cover-source mismatch setting. results show cnn fnn naturally robust mismatch problem. addition experiments, provide discussions internal mechanisms cnn, weave links previously stated ideas, order understand impressive results obtained.",4 "sparsity-based defense adversarial attacks linear classifiers. deep neural networks represent state art machine learning growing number fields, including vision, speech natural language processing. 
however, recent work raises important questions robustness architectures, showing possible induce classification errors tiny, almost imperceptible, perturbations. vulnerability ""adversarial attacks"", ""adversarial examples"", conjectured due excessive linearity deep networks. paper, study phenomenon setting linear classifier, show possible exploit sparsity natural data combat $\ell_{\infty}$-bounded adversarial perturbations. specifically, demonstrate efficacy sparsifying front end via ensemble averaged analysis, experimental results mnist handwritten digit database. best knowledge, first work show sparsity provides theoretically rigorous framework defense adversarial attacks.",19 "hierarchical internal representation spectral features deep convolutional networks trained eeg decoding. recently, increasing interest research interpretability machine learning models, example transform internally represent eeg signals brain-computer interface (bci) applications. help understand limits model may improved, addition possibly provide insight data itself. schirrmeister et al. (2017) recently reported promising results eeg decoding deep convolutional neural networks (convnets) trained end-to-end manner and, causal visualization approach, showed learn use spectral amplitude changes input. study, investigate convnets represent spectral features sequence intermediate stages network. show higher sensitivity eeg phase features earlier stages higher sensitivity eeg amplitude features later stages. intriguingly, observed specialization individual stages network classical eeg frequency bands alpha, beta, high gamma. furthermore, find first evidence particularly last convolutional layer, network learns detect complex oscillatory patterns beyond spectral phase amplitude, reminiscent representation complex visual features later layers convnets computer vision tasks. 
findings thus provide insights convnets hierarchically represent spectral eeg features intermediate layers suggest convnets exploit might help better understand compositional structure eeg time series.",4 "combinatorial multi-armed bandits filtered feedback. motivated problems search detection present solution combinatorial multi-armed bandit (cmab) problem heavy-tailed reward distributions new class feedback, filtered semibandit feedback. cmab problem agent pulls combination arms set $\{1,...,k\}$ round, generating random outcomes probability distributions associated arms receiving overall reward. semibandit feedback assumed random outcomes generated observed. filtered semibandit feedback allows outcomes observed sampled second distribution conditioned initial random outcomes. feedback mechanism valuable allows cmab methods applied sequential search detection problems combinatorial actions made, true rewards (number objects interest appearing round) observed, rather filtered reward (the number objects searcher successfully finds, must definition less number appear). present upper confidence bound type algorithm, robust-f-cucb, associated regret bound order $\mathcal{o}(\ln(n))$ balance exploration exploitation face filtering reward heavy tailed reward distributions.",4 "progressive representation adaptation weakly supervised object localization. address problem weakly supervised object localization image-level annotations available training object detectors. numerous methods proposed tackle problem mining object proposals. however, substantial amount noise object proposals causes ambiguities learning discriminative object models. approaches sensitive model initialization often converge undesirable local minimum solutions. paper, propose overcome drawbacks progressive representation adaptation two main steps: 1) classification adaptation 2) detection adaptation. 
classification adaptation, transfer pre-trained network multi-label classification task recognizing presence certain object image. classification adaptation step, network learns discriminative representations specific object categories interest. detection adaptation, mine class-specific object proposals exploiting two scoring strategies based adapted classification network. class-specific proposal mining helps remove substantial noise background clutter potential confusion similar objects. refine proposals using multiple instance learning segmentation cues. using refined object bounding boxes, fine-tune layer classification network obtain fully adapted detection network. present detailed experimental validation pascal voc ilsvrc datasets. experimental results demonstrate progressive representation adaptation algorithm performs favorably state-of-the-art methods.",4 "algorithms generating ordered solutions explicit and/or structures. present algorithms generating alternative solutions explicit acyclic and/or structures non-decreasing order cost. proposed algorithms use best first search technique report solutions using implicit representation ordered cost. paper, present two versions search algorithm -- (a) initial version best first search algorithm, asg, may present one solution generating ordered solutions, (b) another version, lasg, avoids construction duplicate solutions. actual solutions reconstructed quickly implicit compact representation used. applied methods test domains, synthetic others based well known problems including search space 5-peg tower hanoi problem, matrix-chain multiplication problem problem finding secondary structure rna. experimental results show efficacy proposed algorithms existing approach. proposed algorithms potential use various domains ranging knowledge based frameworks service composition, and/or structure widely used representing problems.",4 "training quantized nets: deeper understanding. 
currently, deep neural networks deployed low-power portable devices first training full-precision model using powerful hardware, deriving corresponding low-precision model efficient inference systems. however, training models directly coarsely quantized weights key step towards learning embedded platforms limited computing resources, memory capacity, power consumption. numerous recent publications studied methods training quantized networks, studies mostly empirical. work, investigate training methods quantized neural networks theoretical viewpoint. first explore accuracy guarantees training methods convexity assumptions. look behavior algorithms non-convex problems, show training algorithms exploit high-precision representations important greedy search phase purely quantized training methods lack, explains difficulty training using low-precision arithmetic.",4 "adaptive strategy superpixel-based region-growing image segmentation. work presents region-growing image segmentation approach based superpixel decomposition. initial contour-constrained over-segmentation input image, image segmentation achieved iteratively merging similar superpixels regions. approach raises two key issues: (1) compute similarity superpixels order perform accurate merging (2) order superpixels must merged together. perspective, firstly introduce robust adaptive multi-scale superpixel similarity region comparisons made content common border level. secondly, propose global merging strategy efficiently guide region merging process. strategy uses adaptive merging criterion ensure best region aggregations given highest priorities. allows reach final segmentation consistent regions strong boundary adherence. perform experiments bsds500 image dataset highlight extent method compares favorably well-known image segmentation algorithms. obtained results demonstrate promising potential proposed approach.",4 "counterexample guided abstraction refinement algorithm propositional circumscription.
circumscription representative example nonmonotonic reasoning inference technique. circumscription often studied first order theories, propositional version also subject extensive research, shown equivalent extended closed world assumption (ecwa). moreover, entailment propositional circumscription well-known example decision problem second level polynomial hierarchy. paper proposes new boolean satisfiability (sat)-based algorithm entailment propositional circumscription explores relationship propositional circumscription minimal models. new algorithm inspired ideas commonly used sat-based model checking, namely counterexample guided abstraction refinement. addition, new algorithm refined compute theory closure generalized close world assumption (gcwa). experimental results show new algorithm solve problem instances solutions unable solve.",4 "improving agreement disagreement identification online discussions socially-tuned sentiment lexicon. study problem agreement disagreement detection online discussions. isotonic conditional random fields (isotonic crf) based sequential model proposed make predictions sentence- segment-level. automatically construct socially-tuned lexicon bootstrapped existing general-purpose sentiment lexicons improve performance. evaluate agreement disagreement tagging model two disparate online discussion corpora -- wikipedia talk pages online debates. model shown outperform state-of-the-art approaches datasets. example, isotonic crf model achieves f1 scores 0.74 0.67 agreement disagreement detection, linear chain crf obtains 0.58 0.56 discussions wikipedia talk pages.",4 "deep learning algorithm one-step contour aware nuclei segmentation histopathological images. paper addresses task nuclei segmentation high-resolution histopathological images. propose automatic end-to-end deep neural network algorithm segmentation individual nuclei.
nucleus-boundary model introduced predict nuclei boundaries simultaneously using fully convolutional neural network. given color normalized image, model directly outputs estimated nuclei map boundary map. simple, fast parameter-free post-processing procedure performed estimated nuclei map produce final segmented nuclei. overlapped patch extraction assembling method also designed seamless prediction nuclei large whole-slide images. also show effectiveness data augmentation methods nuclei segmentation task. experiments showed method outperforms prior state-of-the-art methods. moreover, efficient one 1000x1000 image segmented less 5 seconds. makes possible precisely segment whole-slide image acceptable time.",4 "bootstrapping lexical choice via multiple-sequence alignment. important component generation system mapping dictionary, lexicon elementary semantic expressions corresponding natural language realizations. typically, labor-intensive knowledge-based methods used construct dictionary. instead propose acquire automatically via novel multiple-pass algorithm employing multiple-sequence alignment, technique commonly used bioinformatics. crucially, method leverages latent information contained multi-parallel corpora -- datasets supply several verbalizations corresponding semantics rather one. used techniques generate natural language versions computer-generated mathematical proofs, good results per-component overall-output basis. example, evaluations involving dozen human judges, system produced output whose readability faithfulness semantic input rivaled traditional generation system.",4 "use lose it: selective memory forgetting perpetual learning machine. recent article described new type deep neural network - perpetual learning machine (plm) - capable learning 'on fly' like brain existing state perpetual stochastic gradient descent (psgd). here, simulating process practice, demonstrate selective memory selective forgetting introduce statistical recall biases psgd.
frequently recalled memories remembered, whilst memories recalled rarely forgotten. results 'use lose it' stimulus driven memory process similar human memory.",4 "max-margin nonparametric latent feature models link prediction. present max-margin nonparametric latent feature model, unites ideas max-margin learning bayesian nonparametrics discover discriminative latent features link prediction automatically infer unknown latent social dimension. minimizing hinge-loss using linear expectation operator, perform posterior inference efficiently without dealing highly nonlinear link likelihood function; using fully-bayesian formulation, avoid tuning regularization constants. experimental results real datasets appear demonstrate benefits inherited max-margin learning fully-bayesian nonparametric inference.",4 "kernel risk-sensitive loss: definition, properties application robust adaptive filtering. nonlinear similarity measures defined kernel space, correntropy, extract higher-order statistics data offer potentially significant performance improvement linear counterparts especially non-gaussian signal processing machine learning. work, propose new similarity measure kernel space, called kernel risk-sensitive loss (krsl), provide important properties. apply krsl adaptive filtering investigate robustness, develop mkrsl algorithm analyze mean square convergence performance. compared correntropy, krsl offer efficient performance surface, thereby enabling gradient based method achieve faster convergence speed higher accuracy still maintaining robustness outliers. theoretical analysis results superior performance new algorithm confirmed simulation.",19 "challenge multi-camera tracking. multi-camera tracking quite different single camera tracking, faces new technology system architecture challenges. 
analyzing corresponding characteristics disadvantages existing algorithms, problems multi-camera tracking summarized new directions future work also generalized.",4 "elu network total variation image denoising. paper, propose novel convolutional neural network (cnn) image denoising, uses exponential linear unit (elu) activation function. investigate suitability analyzing elu's connection trainable nonlinear reaction diffusion model (tnrd) residual denoising. hand, batch normalization (bn) indispensable residual denoising convergence purpose. however, direct stacking bn elu degrades performance cnn. mitigate issue, design innovative combination activation layer normalization layer exploit leverage elu network, discuss corresponding rationale. moreover, inspired fact minimizing total variation (tv) applied image denoising, propose tv regularized l2 loss evaluate training effect iterations. finally, conduct extensive experiments, showing model outperforms recent popular approaches gaussian denoising specific randomized noise levels gray color images.",4 "exploiting feature class relationships video categorization regularized deep neural networks. paper, study challenging problem categorizing videos according high-level semantics existence particular human action complex event. although extensive efforts devoted recent years, existing works combined multiple video features using simple fusion strategies neglected utilization inter-class semantic relationships. paper proposes novel unified framework jointly exploits feature relationships class relationships improved categorization performance. specifically, two types relationships estimated utilized rigorously imposing regularizations learning process deep neural network (dnn). regularized dnn (rdnn) efficiently realized using gpu-based implementation affordable training cost. arming dnn better capability harnessing feature class relationships, proposed rdnn suitable modeling video semantics. 
extensive experimental evaluations, show rdnn produces superior performance several state-of-the-art approaches. well-known hollywood2 columbia consumer video benchmarks, obtain competitive results: 66.9\% 73.5\% respectively terms mean average precision. addition, substantially evaluate rdnn stimulate future research large scale video categorization, collect release new benchmark dataset, called fcvid, contains 91,223 internet videos 239 manually annotated categories.",4 "clustering multidimensional data pso based algorithm. data clustering recognized data analysis method data mining whereas k-means well known partitional clustering method, possessing pleasant features. observed that, k-means partitional clustering techniques suffer several limitations initial cluster centre selection, preknowledge number clusters, dead unit problem, multiple cluster membership premature convergence local optima. several optimization methods proposed literature order solve clustering limitations, swarm intelligence (si) achieved remarkable position concerned area. particle swarm optimization (pso) popular si technique one favorite areas researchers. paper, present brief overview pso applicability variants solve clustering challenges. also, propose advanced pso algorithm named subtractive clustering based boundary restricted adaptive particle swarm optimization (sc-br-apso) algorithm clustering multidimensional data. comparison purpose, studied analyzed various algorithms k-means, pso, k-means-pso, hybrid subtractive + pso, brapso, proposed algorithm nine different datasets. motivation behind proposing sc-br-apso algorithm deal multidimensional data clustering, minimum error rate maximum convergence rate.",4 "ian: individual aggregation network person search. person search real-world scenarios new challenging computer vision task many meaningful applications.
challenge task mainly comes from: (1) unavailable bounding boxes pedestrians model needs search person whole gallery images; (2) huge variance visual appearance particular person owing varying poses, lighting conditions, occlusions. address two critical issues modern person search applications, propose novel individual aggregation network (ian) accurately localize persons learning minimize intra-person feature variations. ian built upon state-of-the-art object detection framework, i.e., faster r-cnn, high-quality region proposals pedestrians produced online manner. addition, relieve negative effect caused varying visual appearances individual, ian introduces novel center loss increase intra-class compactness feature representations. engaged center loss encourages persons identity similar feature characteristics. extensive experimental results two benchmarks, i.e., cuhk-sysu prw, well demonstrate superiority proposed model. particular, ian achieves 77.23% map 80.45% top-1 accuracy cuhk-sysu, outperform state-of-the-art 1.7% 1.85%, respectively.",4 "value alignment, fair play, rights service robots. ethics safety research artificial intelligence increasingly framed terms ""alignment"" human values interests. argue turing's call ""fair play machines"" early often overlooked contribution alignment literature. turing's appeal fair play suggests need correct human behavior accommodate machines, surprising inversion value alignment treated today. reflections ""fair play"" motivate novel interpretation turing's notorious ""imitation game"" condition intelligence instead value alignment: machine demonstrates minimal degree alignment (with norms conversation, instance) go undetected interrogated human. carefully distinguish interpretation moral turing test, motivated principle fair play, instead depends imitation human moral behavior. finally, consider framework fair play used situate debate robot rights within alignment literature. 
argue extending rights service robots operating public spaces ""fair"" precisely sense encourages alignment interests humans machines.",4 "online unsupervised feature learning visual tracking. feature encoding respect over-complete dictionary learned unsupervised methods, followed spatial pyramid pooling, linear classification, exhibited powerful strength various vision applications. propose use feature learning pipeline visual tracking. tracking implemented using tracking-by-detection resulted framework simple yet effective. first, online dictionary learning used build dictionary, captures appearance changes tracking target well background changes. given test image window, extract local image patches local patch encoded respect dictionary. encoded features pooled spatial pyramid form aggregated feature vector. finally, simple linear classifier trained features. experiments show proposed powerful---albeit simple---tracker, outperforms state-of-the-art tracking methods tested. moreover, evaluate performance different dictionary learning feature encoding methods proposed tracking framework, analyse impact component tracking scenario. also demonstrate flexibility feature learning plugging hare et al.'s tracking method. outcome is, knowledge, best tracker ever reported, facilitates advantages feature learning structured output prediction.",4 "pros cons gan evaluation measures. generative models, particular generative adversarial networks (gans), received lot attention recently. number gan variants proposed utilized many applications. despite large strides terms theoretical progress, evaluating comparing gans remains daunting task. several measures introduced, yet, consensus measure best captures strengths limitations models used fair model comparison. areas computer vision machine learning, critical settle one good measures steer progress field.
paper, review critically discuss 19 quantitative 4 qualitative measures evaluating generative models particular emphasis gan-derived models.",4 "discrete network dynamics. part 1: operator theory. operator algebra implementation markov chain monte carlo algorithms simulating markov random fields proposed. allows dynamics networks whose nodes discrete state spaces specified action update operator composed creation annihilation operators. formulation discrete network dynamics properties similar quantum field theory bosons, allows reuse many conceptual theoretical structures qft. equilibrium behaviour one generalised mrfs adaptive cluster expansion network (acenet) shown equivalent, provides way unifying two theories.",4 "decision uncertainty diagnosis. paper describes incorporation uncertainty diagnostic reasoning based set covering model reggia et al. extended artificial intelligence dichotomy deep compiled (shallow, surface) knowledge based diagnosis may viewed generic form compiled end spectrum. major undercurrent advocating need strong underlying model integrated set support tools carrying model order deal uncertainty.",4 "neural sequence model training via $\alpha$-divergence minimization. propose new neural sequence model training method objective function defined $\alpha$-divergence. demonstrate objective function generalizes maximum-likelihood (ml)-based reinforcement learning (rl)-based objective functions special cases (i.e., ml corresponds $\alpha \to 0$ rl $\alpha \to 1$). also show gradient objective function considered mixture ml- rl-based objective gradients. experimental results machine translation task show minimizing objective function $\alpha > 0$ outperforms $\alpha \to 0$, corresponds ml-based methods.",4 "topic stability noisy sources. topic modelling techniques lda recently applied speech transcripts ocr output. corpora may contain noisy erroneous texts may undermine topic stability.
therefore, important know well topic modelling algorithm perform applied noisy data. paper show different types textual noise diverse effects stability different topic models. observations, propose guidelines text corpus generation, focus automatic speech transcription. also suggest topic model selection methods noisy corpora.",4 "target tracking real time surveillance cameras videos. security concerns kept increasing, important everyone keep property safe thefts destruction. need surveillance techniques also increasing. system developed detect motion video. system developed real time applications using techniques background subtraction frame differencing. system, motion detected webcam real time video. background subtraction frames differencing method used detect moving target. background subtraction method, current frame subtracted referenced frame threshold applied. difference greater threshold considered pixel moving object, otherwise considered background pixel. similarly, two frames difference method takes difference two continuous frames. resultant difference frame thresholded amount difference pixels calculated.",4 "unsupervised spike sorting based discriminative subspace learning. spike sorting fundamental preprocessing step many neuroscience studies rely analysis spike trains. paper, present two unsupervised spike sorting algorithms based discriminative subspace learning. first algorithm simultaneously learns discriminative feature subspace performs clustering. uses histogram features discriminative projection detect number neurons. second algorithm performs hierarchical divisive clustering learns discriminative 1-dimensional subspace clustering level hierarchy achieving almost unimodal distribution subspace. algorithms tested synthetic in-vivo data, compared two widely used spike sorting methods. comparative results demonstrate spike sorting methods achieve substantially higher accuracy lower dimensional feature space, highly robust noise. 
moreover, provide significantly better cluster separability learned subspace subspace obtained principal component analysis wavelet transform.",4 "clustering-based quantisation pde-based image compression. finding optimal data inpainting key problem context partial differential equation based image compression. data yields accurate reconstruction real-valued. thus, quantisation models mandatory allow efficient encoding. also understood challenging data clustering problems. although clustering approaches well suited kind compression codecs, works actually consider them. pixel global impact reconstruction optimal data locations strongly correlated corresponding colour values. facts make hard predict feature works best. paper discuss quantisation strategies based popular methods k-means. lead central question kind feature vectors best suited image compression. end consider choices pixel values, histogram colour map. findings show number colours reduced significantly without impacting reconstruction quality. surprisingly, benefits directly translate good image compression performance. gains compression ratio lost due increased storage costs. suggests integral evaluate clustering both, reconstruction error final file size.",4 "stochastic variance reduction methods policy evaluation. policy evaluation crucial step many reinforcement-learning procedures, estimates value function predicts states' long-term value given policy. paper, focus policy evaluation linear function approximation fixed dataset. first transform empirical policy evaluation problem (quadratic) convex-concave saddle point problem, present primal-dual batch gradient method, well two stochastic variance reduction methods solving problem. algorithms scale linearly sample size feature dimension. moreover, achieve linear convergence even saddle-point problem strong concavity dual variables strong convexity primal variables. 
numerical experiments benchmark problems demonstrate effectiveness methods.",4 "mixture counting cnns: adaptive integration cnns specialized specific appearance crowd counting. paper proposes crowd counting method. crowd counting difficult large appearance changes target caused density scale changes. conventional crowd counting methods generally utilize one predictor (e.g., regression multi-class classifier). however, one predictor count targets large appearance changes well. paper, propose predict number targets using multiple cnns specialized specific appearance, cnns adaptively selected according appearance test image. integrating selected cnns, proposed method robustness large appearance changes. experiments, confirm proposed method count crowd lower counting error cnn integration cnns fixed weights. moreover, confirm predictor automatically specialized specific appearance.",4 "improved ant colony system sequential ordering problem. rare performance one metaheuristic algorithm improved incorporating ideas taken another. article present simulated annealing (sa) used improve efficiency ant colony system (acs) enhanced acs solving sequential ordering problem (sop). moreover, show ideas applied improve convergence dedicated local search, i.e. sop-3-exchange algorithm. statistical analysis proposed algorithms terms finding suitable parameter values quality generated solutions presented based series computational experiments conducted sop instances well-known tsplib soplib2006 repositories. proposed acs-sa eacs-sa algorithms often generate solutions better quality acs eacs, respectively. moreover, eacs-sa algorithm combined proposed sop-3-exchange-sa local search able find 10 new best solutions sop instances soplib2006 repository, thus improving state-of-the-art results known literature. overall, best known improved solutions found 41 48 cases.",4 "metric-free natural gradient joint-training boltzmann machines.
paper introduces metric-free natural gradient (mfng) algorithm training boltzmann machines. similar spirit hessian-free method martens [8], algorithm belongs family truncated newton methods exploits efficient matrix-vector product avoid explicitly storing natural gradient metric $l$. metric shown expected second derivative log-partition function (under model distribution), equivalently, variance vector partial derivatives energy function. evaluate method task joint-training 3-layer deep boltzmann machine show mfng indeed faster per-epoch convergence compared stochastic maximum likelihood centering, though wall-clock performance currently competitive.",4 "tiny descriptors image retrieval unsupervised triplet hashing. typical image retrieval pipeline starts comparison global descriptors large database find short list candidate matches. good image descriptor key retrieval pipeline reconcile two contradictory requirements: providing recall rates high possible compact possible fast matching. following recent successes deep convolutional neural networks (dcnn) large scale image classification, descriptors extracted dcnns increasingly used place traditional hand crafted descriptors fisher vectors (fv) better retrieval performances. nevertheless, dimensionality typical dcnn descriptor --extracted either visual feature pyramid fully-connected layers-- remains quite high several thousands scalar values. paper, propose unsupervised triplet hashing (uth), fully unsupervised method compute extremely compact binary hashes --in 32-256 bits range-- high-dimensional global descriptors. uth consists two successive deep learning steps. first, stacked restricted boltzmann machines (srbm), type unsupervised deep neural nets, used learn binary embedding functions able bring descriptor size desired bitrate. srbms typically able ensure high compression rate expense losing desirable metric properties original dcnn descriptor space.
then, triplet networks, rank learning scheme based weight sharing nets used fine-tune binary embedding functions retain much possible useful metric properties original space. thorough empirical evaluation conducted multiple publicly available dataset using dcnn descriptors shows method able significantly outperform state-of-the-art unsupervised schemes target bit range.",4 "large-scale optimization algorithms sparse conditional gaussian graphical models. paper addresses problem scalable optimization l1-regularized conditional gaussian graphical models. conditional gaussian graphical models generalize well-known gaussian graphical models conditional distributions model output network influenced conditioning input variables. highly scalable optimization methods exist sparse gaussian graphical model estimation, state-of-the-art methods conditional gaussian graphical models efficient enough importantly, fail due memory constraints large problems. paper, propose new optimization procedure based newton method efficiently iterates two sub-problems, leading drastic improvement computation time compared previous methods. extend method scale large problems memory constraints, using block coordinate descent limit memory usage achieving fast convergence. using synthetic genomic data, show methods solve one million dimensional problems high accuracy little day single machine.",19 "adversarial perturbations deep neural networks malware classification. deep neural networks, like many machine learning models, recently shown lack robustness adversarially crafted inputs. inputs derived regular inputs minor yet carefully selected perturbations deceive machine learning models desired misclassifications. existing work emerging field largely specific domain image classification, since high-entropy images conveniently manipulated without changing images' overall visual appearance. 
yet, remains unclear attacks translate security-sensitive applications malware detection - may pose significant challenges sample generation arguably grave consequences failure. paper, show construct highly-effective adversarial sample crafting attacks neural networks used malware classifiers. application domain malware classification introduces additional constraints adversarial sample crafting problem compared computer vision domain: (i) continuous, differentiable input domains replaced discrete, often binary inputs; (ii) loose condition leaving visual appearance unchanged replaced requiring equivalent functional behavior. demonstrate feasibility attacks many different instances malware classifiers trained using drebin android malware data set. furthermore evaluate extent potential defensive mechanisms adversarial crafting leveraged setting malware classification. feature reduction prove positive impact, distillation re-training adversarially crafted samples show promising results.",4 "multiscale edge detection parametric shape modeling boundary delineation optoacoustic images. article, present novel scheme segmenting image boundary (with background) optoacoustic small animal vivo imaging systems. method utilizes multiscale edge detection algorithm generate binary edge map. scale dependent morphological operation employed clean spurious edges. thereafter, ellipse fitted edge map constrained parametric transformations iterative goodness fit calculations. method delimits tissue edges curve fitting model, shown high levels accuracy. thus, method enables segmentation optoacoustic images minimal human intervention, eliminating need scale selection multiscale processing seed point determination contour mapping.",4 "control crack propagate along specified path feasibly?. controllable crack propagation (ccp) strategy suggested. well known crack always leads failure crossing critical domain engineering structure.
therefore, ccp method proposed control crack propagate along specified path, away critical domain. complete strategy, two optimization methods engaged. firstly, back propagation neural network (bpnn) assisted particle swarm optimization (pso) suggested. method, improve efficiency ccp, bpnn used build metamodel instead forward evaluation. secondly, popular pso used. considering optimization iteration time consuming process, efficient reanalysis based extended finite element methods (x-fem) used substitute complete x-fem solver calculate crack propagation path. moreover, adaptive subdomain partition strategy suggested improve fitting accuracy real crack specified paths. several typical numerical examples demonstrate optimization methods carry ccp. selection determined tradeoff efficiency accuracy.",4 "swarm intelligence based algorithms: critical analysis. many optimization algorithms developed drawing inspiration swarm intelligence (si). si-based algorithms advantages traditional algorithms. paper, carry critical analysis si-based algorithms analyzing ways mimic evolutionary operators. also analyze ways achieving exploration exploitation algorithms using mutation, crossover selection. addition, also look algorithms using dynamic systems, self-organization markov chain framework. finally, provide discussions topics research.",12 "recognizing static signs brazilian sign language: comparing large-margin decision directed acyclic graphs, voting support vector machines artificial neural networks. paper, explore detail experiments high-dimensionality, multi-class image classification problem often found automatic recognition sign languages. here, efforts directed towards comparing characteristics, advantages drawbacks creating training support vector machines disposed directed acyclic graph artificial neural networks classify signs brazilian sign language (libras). 
explore different heuristics, hyperparameters multi-class decision schemes affect performance, efficiency ease use classifier. provide hyperparameter surface maps capturing accuracy efficiency, comparisons ddags 1-vs-1 svms, effects heuristics training anns resilient backpropagation. report statistically significant results using cohen's kappa statistic contingency tables.",4 "introduction ross: new representational scheme. ross (""representation, ontology, structure, star"") introduced new method knowledge representation emphasizes representational constructs physical structure. ross representational scheme includes language called ""star"" specification ontology classes. ross method also includes formal scheme called ""instance model"". instance models used area natural language meaning representation represent situations. paper provides rationale philosophical background ross method.",4 "context-aware captions context-agnostic supervision. introduce inference technique produce discriminative context-aware image captions (captions describe differences images visual concepts) using generic context-agnostic training data (captions describe concept image isolation). example, given images captions ""siamese cat"" ""tiger cat"", generate language describes ""siamese cat"" way distinguishes ""tiger cat"". key novelty show joint inference language model context-agnostic listener distinguishes closely-related concepts. first apply technique justification task, namely describe image contains particular fine-grained category opposed another closely-related category cub-200-2011 dataset. study discriminative image captioning generate language uniquely refers one two semantically-similar images coco dataset. evaluations discriminative ground truth justification human studies discriminative image captioning reveal approach outperforms baseline generative speaker-listener approaches discrimination.",4 "cubic range error model stereo vision illuminators. 
use low-cost depth sensors, stereo camera setup illuminators, particular interest numerous applications ranging robotics transportation mixed augmented reality. ability quantify noise crucial applications, e.g., sensor used map generation develop sensor scheduling policy multi-sensor setup. range error models provide uncertainty estimates help weigh data correctly instances range measurements taken different vantage points different sensors. weighing important fuse range data map meaningful way, i.e., high confidence data relied heavily. model derived work. show range error stereo systems integrated illuminators cubic validate proposed model experimentally off-the-shelf structured light stereo system. experiments confirm validity model simplify application type sensor robotics. proposed error model relevant stereo system low ambient light main light source located camera system. among others, case structured light stereo systems night stereo systems headlights. work, propose range error cubic range stereo systems integrated illuminators. experimental validation off-the-shelf structured light stereo system shows exponent 2.4 2.6. deviation attributed model considering shot noise.",4 "dynamic island model based spectral clustering genetic algorithm. maintain relative high diversity important avoid premature convergence population-based optimization methods. island model widely considered major approach achieve flexibility high efficiency. model maintains group sub-populations different islands allows sub-populations interact via predefined migration policies. however, current island model drawbacks. one certain number generations, different islands may retain quite similar, converged sub-populations thereby losing diversity decreasing efficiency. another drawback determining number islands maintain also challenging. meanwhile initializing many sub-populations increases randomness island model. 
address issues, proposed dynamic island model (dim-sp) force island maintain different sub-populations, control number islands dynamically starts one sub-population. proposed island model outperforms three state-of-the-art island models three baseline optimization problems including job shop scheduler problem, travelling salesmen problem quadratic multiple knapsack problem.",4 "imaging time-series improve classification imputation. inspired recent successes deep learning computer vision, propose novel framework encoding time series different types images, namely, gramian angular summation/difference fields (gasf/gadf) markov transition fields (mtf). enables use techniques computer vision time series classification imputation. used tiled convolutional neural networks (tiled cnns) 20 standard datasets learn high-level features individual compound gasf-gadf-mtf images. approaches achieve highly competitive results compared nine current best time series classification approaches. inspired bijection property gasf 0/1 rescaled data, train denoised auto-encoders (da) gasf images four standard one synthesized compound dataset. imputation mse test data reduced 12.18%-48.02% compared using raw data. analysis features weights learned via tiled cnns das explains approaches work.",4 "partial functional correspondence. paper, propose method computing partial functional correspondence non-rigid shapes. use perturbation analysis show removal shape parts changes laplace-beltrami eigenfunctions, exploit prior spectral representation correspondence. corresponding parts optimization variables problem used weight functional correspondence; looking largest regular (in mumford-shah sense) parts minimize correspondence distortion. show approach cope challenging correspondence settings.",4 "learning diversify via weighted kernels classifier ensemble. classifier ensemble generally combine diverse component classifiers.
however, difficult give definitive connection diversity measure ensemble accuracy. given list available component classifiers, adaptively diversely ensemble classifiers becomes big challenge literature. paper, argue diversity, direct diversity samples adaptive diversity data, highly correlated ensemble accuracy, propose novel technology classifier ensemble, learning diversify, learns adaptively combine classifiers considering accuracy diversity. specifically, approach, learning diversify via weighted kernels (l2dwk), performs classifier combination optimizing direct simple criterion: maximizing ensemble accuracy adaptive diversity simultaneously minimizing convex loss function. given measure formulation, diversity calculated weighted kernels (i.e., diversity measured component classifiers' outputs kernelled weighted), kernel weights automatically learned. minimize loss function estimating kernel weights conjunction classifier weights, propose self-training algorithm conducting convex optimization procedure iteratively. extensive experiments variety 32 uci classification benchmark datasets show proposed approach consistently outperforms state-of-the-art ensembles bagging, adaboost, random forests, gasen, regularized selective ensemble, ensemble pruning via semi-definite programming.",4 "learning remember translation history continuous cache. existing neural machine translation (nmt) models generally translate sentences isolation, missing opportunity take advantage document-level information. work, propose augment nmt models light-weight cache-like memory network, stores recent hidden representations translation history. probability distribution generated words updated online depending translation history retrieved memory, endowing nmt models capability dynamically adapt time. 
experiments multiple domains different topics styles show effectiveness proposed approach negligible impact computational cost.",4 "learning compact recurrent neural networks block-term tensor decomposition. recurrent neural networks (rnns) powerful sequence modeling tools. however, dealing high dimensional inputs, training rnns becomes computational expensive due large number model parameters. hinders rnns solving many important computer vision tasks, action recognition videos image captioning. overcome problem, propose compact flexible structure, namely block-term tensor decomposition, greatly reduces parameters rnns improves training efficiency. compared alternative low-rank approximations, tensor-train rnn (tt-rnn), method, block-term rnn (bt-rnn), concise (when using rank), also able attain better approximation original rnns much fewer parameters. three challenging tasks, including action recognition videos, image captioning image generation, bt-rnn outperforms tt-rnn standard rnn terms prediction accuracy convergence rate. specifically, bt-lstm utilizes 17,388 times fewer parameters standard lstm achieve accuracy improvement 15.6\% action recognition task ucf11 dataset.",4 "neural algorithm artistic style. fine art, especially painting, humans mastered skill create unique visual experiences composing complex interplay content style image. thus far algorithmic basis process unknown exists artificial system similar capabilities. however, key areas visual perception object face recognition near-human performance recently demonstrated class biologically inspired vision models called deep neural networks. introduce artificial system based deep neural network creates artistic images high perceptual quality. system uses neural representations separate recombine content style arbitrary images, providing neural algorithm creation artistic images. 
moreover, light striking similarities performance-optimised artificial neural networks biological vision, work offers path forward algorithmic understanding humans create perceive artistic imagery.",4 "improving term extraction terminological resources. studies different term extractors corpus biomedical domain revealed decreasing performances applied highly technical texts. difficulty impossibility customising new domains additional limitation. paper, propose use external terminologies influence generic linguistic data order augment quality extraction. tool implemented exploits testified terms different steps process: chunking, parsing extraction term candidates. experiments reported show that, using method, term candidates acquired higher level reliability. describe extraction process involving endogenous disambiguation implemented term extractor yatea.",4 "persona-based neural conversation model. present persona-based models handling issue speaker consistency neural response generation. speaker model encodes personas distributed embeddings capture individual characteristics background information speaking style. dyadic speaker-addressee model captures properties interactions two interlocutors. models yield qualitative performance improvements perplexity bleu scores baseline sequence-to-sequence models, similar gains speaker consistency measured human judges.",4 "automatic summarization online debates. debate summarization one novel challenging research areas automatic text summarization largely unexplored. paper, develop debate summarization pipeline summarize key topics discussed argued two opposing sides online debates. view generation debate summaries achieved clustering, cluster labeling, visualization. work, investigate two different clustering approaches generation summaries. first approach, generate summaries applying purely term-based clustering cluster labeling. second approach makes use x-means clustering mutual information labeling clusters. 
approaches driven ontologies. visualize results using bar charts. think results smooth entry users aiming receive first impression discussed within debate topic containing vast number argumentations.",4 "fuzzy vault fingerprints vulnerable brute force attack. \textit{fuzzy vault} approach one best studied well accepted ideas binding cryptographic security biometric authentication. vault implemented connection fingerprint data uludag jain. show instance vault vulnerable brute force attack. interceptor vault data recover secret template data using generally affordable computational resources. possible alternatives discussed suggested cryptographic security may preferable one-way function approach biometric security.",4 "hacking smart machines smarter ones: extract meaningful data machine learning classifiers. machine learning (ml) algorithms used train computers perform variety complex tasks improve experience. computers learn recognize patterns, make unintended decisions, react dynamic environment. certain trained machines may effective others based suitable ml algorithms trained superior training sets. although ml algorithms known publicly released, training sets may reasonably ascertainable and, indeed, may guarded trade secrets. much research performed privacy elements training sets, paper focus attention ml classifiers statistical information unconsciously maliciously revealed them. show possible infer unexpected useful information ml classifiers. particular, build novel meta-classifier train hack classifiers, obtaining meaningful information training sets. kind information leakage exploited, example, vendor build effective classifiers simply acquire trade secrets competitor's apparatus, potentially violating intellectual property rights.",4 "iterative object part transfer fine-grained recognition. aim fine-grained recognition identify sub-ordinate categories images like different species birds.
existing works confirmed that, order capture subtle differences across categories, automatic localization objects parts critical. approaches object part localization relied bottom-up pipeline, thousands region proposals generated filtered pre-trained object/part models. computationally expensive scalable number objects/parts becomes large. paper, propose nonparametric data-driven method object part localization. given unlabeled test image, approach transfers annotations similar images retrieved training set. particular, propose iterative transfer strategy gradually refine predicted bounding boxes. based located objects parts, deep convolutional features extracted recognition. evaluate approach widely-used cub200-2011 dataset new large dataset called birdsnap. datasets, achieve better results many state-of-the-art approaches, including using oracle (manually annotated) bounding boxes test images.",4 "$\mathcal{o}(n\log n)$ projection operator weighted $\ell_1$-norm regularization sum constraint. provide simple efficient algorithm projection operator weighted $\ell_1$-norm regularization subject sum constraint, together elementary proof. implementation proposed algorithm downloaded author's homepage.",4 "bounded recursive self-improvement. designed machine becomes increasingly better behaving underspecified circumstances, goal-directed way, job, modeling environment experience accumulates. based principles autocatalysis, endogeny, reflectivity, work provides architectural blueprint constructing systems high levels operational autonomy underspecified circumstances, starting small seed. value-driven dynamic priority scheduling controlling parallel execution vast number reasoning threads, system achieves recursive self-improvement leaves lab, within boundaries imposed designers. prototype system implemented demonstrated learn complex real-world task, real-time multimodal dialogue humans, on-line observation. 
work presents solutions several challenges must solved achieving artificial general intelligence.",4 noisy expectation-maximization: applications generalizations. present noise-injected version expectation-maximization (em) algorithm: noisy expectation maximization (nem) algorithm. nem algorithm uses noise speed convergence em algorithm. nem theorem shows injected noise speeds average convergence em algorithm local maximum likelihood surface positivity condition holds. generalized form noisy expectation-maximization (nem) algorithm allow arbitrary modes noise injection including adding multiplying noise data. demonstrate noise benefits em algorithms gaussian mixture model (gmm) additive multiplicative nem noise injection. separate theorem (not presented here) shows noise benefit independent identically distributed additive noise decreases sample size mixture models. theorem implies noise benefit pronounced data sparse. injecting blind noise slowed convergence.,19 "theoretical analysis ndcg type ranking measures. central problem ranking design ranking measure evaluation ranking functions. paper study, theoretical perspective, widely used normalized discounted cumulative gain (ndcg)-type ranking measures. although extensive empirical studies ndcg, little known theoretical properties. first show that, whatever ranking function is, standard ndcg adopts logarithmic discount, converges 1 number items rank goes infinity. first sight, result surprising. seems imply ndcg cannot differentiate good bad ranking functions, contradicting empirical success ndcg many applications. order deeper understanding ranking measures general, propose notion referred consistent distinguishability. notion captures intuition ranking measure property: every pair substantially different ranking functions, ranking measure decide one better consistent manner almost datasets. show ndcg logarithmic discount consistent distinguishability although converges limit ranking functions. 
next characterize set feasible discount functions ndcg according concept consistent distinguishability. specifically show whether ndcg consistent distinguishability depends fast discount decays, 1/r critical point. turn cut-off version ndcg, i.e., ndcg@k. analyze distinguishability ndcg@k various choices k discount functions. experimental results real web search datasets agree well theory.",4 "hierarchical latent word clustering. paper presents new bayesian non-parametric model extending usage hierarchical dirichlet allocation extract tree structured word clusters text data. inference algorithm model collects words cluster share similar distribution documents. experiments, observed meaningful hierarchical structures nips corpus radiology reports collected public repositories.",4 "bio-inspired data mining: treating malware signatures biosequences. application machine learning bioinformatics problems well established. less well understood application bioinformatics techniques machine learning and, particular, representation non-biological data biosequences. aim paper explore effects giving amino acid representation problematic machine learning data evaluate benefits supplementing traditional machine learning bioinformatics tools techniques. signatures 60 computer viruses 60 computer worms converted amino acid representations first multiply aligned separately identify conserved regions across different families within class (virus worm). followed second alignment 120 aligned signatures together non-conserved regions identified prior input number machine learning techniques. differences length virus worm signatures first alignment resolved second alignment. first set experiments indicates representing computer malware signatures amino acid sequences followed alignment leads greater classification prediction accuracy. 
second set experiments indicates checking results data mining artificial virus worm data known proteins lead generalizations made domain naturally occurring proteins malware signatures. however, work needed determine advantages disadvantages different representations sequence alignment methods handling problematic machine learning data.",4 "fast rates bandit optimization upper-confidence frank-wolfe. consider problem bandit optimization, inspired stochastic optimization online learning problems bandit feedback. problem, objective minimize global loss function actions, necessarily cumulative loss. framework allows us study general class problems, applications statistics, machine learning, fields. solve problem, analyze upper-confidence frank-wolfe algorithm, inspired techniques bandits convex optimization. give theoretical guarantees performance algorithm various classes functions, discuss optimality results.",4 "mutual kernel matrix completion. huge influx various data nowadays, extracting knowledge become interesting tedious task among data scientists, particularly data come heterogeneous form missing information. many data completion techniques introduced, especially advent kernel methods. however, among many data completion techniques available literature, studies mutually completing several incomplete kernel matrices given much attention yet. paper, present new method, called mutual kernel matrix completion (mkmc) algorithm, tackles problem mutually inferring missing entries multiple kernel matrices combining notions data fusion kernel matrix completion, applied biological data sets used classification task. first introduced objective function minimized exploiting em algorithm, turn results estimate missing entries kernel matrices involved. completed kernel matrices combined produce model matrix used improve obtained estimates. interesting result study e-step m-step given closed form, makes algorithm efficient terms time memory. 
After completion, the (completed) kernel matrices are used to train an SVM classifier to test how well the relationships among the entries are preserved. Our empirical results show that the proposed algorithm bests the traditional completion techniques in preserving the relationships among the data points, and in accurately recovering the missing kernel matrix entries. By far, MKMC offers a promising solution to the problem of mutual estimation of a number of relevant incomplete kernel matrices.",4 "Fractal structures in adversarial prediction. Fractals are self-similar recursive structures that have been used for modeling several real-world processes. In this work we study ""fractal-like"" processes that arise in a prediction game where an adversary is generating a sequence of bits and an algorithm is trying to predict them. We see that, under a certain formalization of the predictive payoff for the algorithm, it is optimal for the adversary to produce a fractal-like sequence to minimize the algorithm's ability to predict. Indeed, it has been suggested that financial markets exhibit fractal-like behavior. We prove that a fractal-like distribution arises naturally out of an optimization from the adversary's perspective. In addition, we give optimal trade-offs between predictability and expected deviation (i.e. sum of bits) for our formalization of predictive payoff. This result is motivated by the observation that several time series data exhibit higher deviations than expected for a completely random walk.",4 "Oblivious branching programs with bounded repetition cannot efficiently compute CNFs of bounded treewidth. In this paper we study the complexity of an extension of ordered binary decision diagrams (OBDDs) called $c$-OBDDs on CNFs of bounded (primal graph) treewidth. In particular, we show that for each $k \geq 3$ there is a class of CNFs of treewidth $k$ for which the equivalent $c$-OBDDs are of size $\Omega(n^{k/(8c-4)})$. Moreover, this lower bound holds when the $c$-OBDD is non-deterministic and semantic. Our second result uses the above lower bound to separate this model from sentential decision diagrams (SDDs). In order to obtain the lower bound, we use a structural graph parameter called matching width. Our third result shows that matching width and pathwidth are linearly related.",4 "Mining process model descriptions of daily life through event abstraction. Process mining techniques focus on extracting insight into processes from event logs.
Process mining has the potential to provide valuable insights into (un)healthy habits and to contribute to ambient assisted living solutions when applied to data from smart home environments. However, events recorded in smart home environments are at the level of sensor triggers, at which process discovery algorithms produce overgeneralizing process models that allow for too much behavior and that are difficult to interpret for human experts. We show that abstracting the events to a higher-level interpretation enables the discovery of more precise and more comprehensible models. We present a framework for the extraction of features that can be used for abstraction with supervised learning methods, based on the XES IEEE standard for event logs. This framework can automatically abstract sensor-level events to their interpretation at the human activity level, after training on data for which both the sensor and human activity events are known. We demonstrate our abstraction framework on three real-life smart home event logs and show that the process models discovered after abstraction are indeed more precise.",4 "Learning sparse deep feedforward networks via tree skeleton expansion. Despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. In contrast, structure learning has been studied extensively for probabilistic graphical models (PGMs). In particular, an efficient algorithm has been developed for learning a class of tree-structured PGMs called hierarchical latent tree models (HLTMs), where there is a layer of observed variables at the bottom and multiple layers of latent variables on top. In this paper, we propose a simple method for learning the structures of feedforward neural networks (FNNs) based on HLTMs. The idea is to expand the connections in the tree skeletons from HLTMs and to use the resulting structures for FNNs. An important characteristic of FNN structures learned this way is that they are sparse. We present extensive empirical results to show that, compared with standard FNNs tuned manually, sparse FNNs learned by our method achieve better or comparable classification performance with much fewer parameters. They are also more interpretable.",4 "POS tagger for code-mixed Indian social media text - ICON-2016 NLP tools contest entry from Surukam.
Building part-of-speech (POS) taggers for code-mixed Indian languages is a particularly challenging problem in computational linguistics due to a dearth of accurately annotated training corpora. ICON, as part of its NLP tools contest, has organized this challenge as a shared task for the second consecutive year to improve the state-of-the-art. This paper describes the POS tagger built at Surukam to predict the coarse-grained and fine-grained POS tags for three language pairs - Bengali-English, Telugu-English and Hindi-English, with text spanning three popular social media platforms - Facebook, WhatsApp and Twitter. We employed conditional random fields as the sequence tagging algorithm and used a library called sklearn-crfsuite - a thin wrapper around CRFsuite - for training our model. Among the features we used are character n-grams, language information, and patterns for emoji, number, punctuation and web-address. Our submissions in the constrained environment, i.e., without making use of monolingual POS taggers and the like, obtained an overall average F1-score of 76.45%, which is comparable to the 2015 winning score of 76.79%.",4 "Fast rhetorical structure theory discourse parsing. In recent years, there has been a variety of research on discourse parsing, particularly RST discourse parsing. Most of the recent work on RST parsing has focused on implementing new types of features or learning algorithms in order to improve accuracy, with relatively little focus on efficiency, robustness, or practical use. Also, few implementations are widely available. Here, we describe an RST segmentation and parsing system that adapts models and feature sets from various previous work, as described below. Its accuracy is near state-of-the-art, and it was developed to be fast, robust, and practical. For example, it can process short documents such as news articles or essays in less than a second.",4 "Analysis of spectrum occupancy using machine learning algorithms. In this paper, we analyze spectrum occupancy using different machine learning techniques. Both supervised techniques (naive Bayesian classifier (NBC), decision trees (DT), support vector machine (SVM), linear regression (LR)) and an unsupervised algorithm (hidden Markov model (HMM)) are studied to find the best technique with the highest classification accuracy (CA).
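The character n-gram features named in the POS-tagging abstract above can be sketched as follows; the feature-dictionary layout is the usual shape consumed by CRFsuite-style taggers, but the exact feature names here are illustrative, not the authors' scheme.

```python
# Sketch of character n-gram features for a CRF token tagger.
# Feature names ("char_2:...", "lower", "is_digit") are illustrative.
def char_ngrams(token: str, n: int) -> list:
    """All character n-grams of a token."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]

def token_features(token: str) -> dict:
    feats = {"lower": token.lower(), "is_digit": token.isdigit()}
    for n in (2, 3):
        for gram in char_ngrams(token, n):
            feats["char_%d:%s" % (n, gram)] = True
    return feats

print(sorted(char_ngrams("khela", 3)))  # -> ['ela', 'hel', 'khe']
```

Such sub-word features are attractive for code-mixed text because they fire on transliteration patterns even when a token was never seen in training.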
A detailed comparison of the supervised and unsupervised algorithms in terms of computational time and classification accuracy is performed. The classified occupancy status is then utilized to evaluate the probability of secondary user outage in future time slots, which can be used by system designers to define spectrum allocation and spectrum sharing policies. Numerical results show that SVM is the best algorithm among all the supervised and unsupervised classifiers. Based on this, we propose a new SVM algorithm combined with the firefly algorithm (FFA), which is shown to outperform the other algorithms.",4 "Identifying purpose behind electoral tweets. Tweets pertaining to a single event, such as a national election, can number in the hundreds of millions. Automatically analyzing them is beneficial in many downstream natural language applications such as question answering and summarization. In this paper, we propose a new task: identifying the purpose behind electoral tweets--why do people post election-oriented tweets? We show that identifying purpose is correlated with the related phenomena of sentiment and emotion detection, yet is significantly different. Detecting purpose has a number of applications, including detecting the mood of the electorate, estimating the popularity of policies, identifying key issues of contention, and predicting the course of events. We create a large dataset of electoral tweets and annotate a thousand of them for purpose. We develop a system that automatically classifies electoral tweets as per their purpose, obtaining an accuracy of 43.56% on an 11-class task and an accuracy of 73.91% on a 3-class task (both accuracies well above the most-frequent-class baseline). Finally, we show that resources developed for emotion detection are also helpful for detecting purpose.",4 "Ordinal rating of network performance and inference by matrix completion. This paper addresses the large-scale acquisition of end-to-end network performance. We make two distinct contributions: ordinal rating of network performance and inference by matrix completion. The former reduces measurement costs and unifies various metrics, which eases their processing in applications. The latter enables scalable and accurate inference with no requirement of structural information of the network or geometric constraints. By combining both, the acquisition problem bears strong similarities to recommender systems.
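The regularized matrix factorization mentioned in the matrix-completion abstract above can be sketched as a small SGD loop; the rank, learning rate and regularization weight are illustrative, and the held-out entry's estimate is only the low-norm completion induced by the learned factors (three observations do not determine it uniquely).

```python
import random

# Minimal regularized matrix factorization (rating ~ p_u . q_i),
# trained by SGD on observed entries only; hyperparameters illustrative.
def matrix_factorize(observed, n_rows, n_cols, rank=2,
                     lr=0.05, reg=0.02, epochs=300, seed=0):
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (u, i, r) in observed:
            pred = sum(P[u][k] * Q[i][k] for k in range(rank))
            err = r - pred
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)  # gradient step on P
                Q[i][k] += lr * (err * pu - reg * qi)  # gradient step on Q
    return P, Q

# Toy 2x2 matrix with one missing entry at (1, 1).
obs = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0)]
P, Q = matrix_factorize(obs, 2, 2)
estimate = sum(P[1][k] * Q[1][k] for k in range(2))
print(round(estimate, 1))
```

In the network-measurement setting, rows would be source hosts, columns destination hosts, and the observed entries ordinal performance ratings.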
This paper investigates the applicability of various matrix factorization models used in recommender systems. We found that the simple regularized matrix factorization is not only practical but also produces accurate results that are beneficial to peer selection.",4 "Evolving intraday foreign exchange trading strategies utilizing multiple instruments price series. We propose a genetic programming architecture for the generation of foreign exchange trading strategies. The system's principal features are the evolution of free-form strategies which do not rely on prior models, and the utilization of price series from multiple instruments as input data. The latter feature constitutes an innovation with respect to previous works documented in the literature. In this article we utilize open, high, low and close bar data at 5-minute frequency for the AUD.USD, EUR.USD, GBP.USD and USD.JPY currency pairs. We test the implementation by analyzing the in-sample and out-of-sample performance of strategies for trading USD.JPY obtained across multiple algorithm runs. We also evaluate the differences between strategies selected according to two different criteria: one relies on the fitness obtained on the training set only, while the second one makes use of an additional validation dataset. Strategy activity and trade accuracy are remarkably stable between in-sample and out-of-sample results. On the profitability aspect, the two criteria result in strategies that are successful on out-of-sample data but exhibit different characteristics. The overall best performing out-of-sample strategy achieves a yearly return of 19%.",4 "Fear the bit flips: optimized coding strategies for binary classification. After being trained, classifiers must often operate on data corrupted by noise. In this paper, we consider the impact of such noise on the features of binary classifiers. Inspired by tools for classifier robustness, we introduce the same classification probability (SCP) to measure the resulting distortion on the classifier outputs. We introduce a low-complexity estimate of the SCP based on quantization and polynomial multiplication. We also study channel coding techniques based on replication and error-correcting codes.
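The replication-coding idea just mentioned can be made concrete with a toy bit-flipping channel: each feature bit is sent several times and decoded by majority vote before reaching the classifier. The flip probability and repetition count below are illustrative, not values from the paper.

```python
import random

# Toy replication coding over a bit-flipping channel.
def transmit(bit, flip_prob, rng):
    """Send one bit through a channel that flips it with flip_prob."""
    return bit ^ (1 if rng.random() < flip_prob else 0)

def repetition_decode(bit, n, flip_prob, rng):
    """Send `bit` n times and return the majority vote."""
    received = [transmit(bit, flip_prob, rng) for _ in range(n)]
    return 1 if sum(received) > n // 2 else 0

rng = random.Random(42)
bits = [rng.randint(0, 1) for _ in range(1000)]
raw = sum(b == transmit(b, 0.2, rng) for b in bits) / len(bits)
coded = sum(b == repetition_decode(b, 5, 0.2, rng) for b in bits) / len(bits)
print(raw, coded)  # majority decoding should lower the bit error rate
```

The paper's point is that, unlike this application-agnostic scheme, redundancy can instead be allocated to maximize the SCP for the same overhead.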
In contrast to the traditional channel coding approach, where error-correction is meant to preserve the data and is agnostic to the application, our schemes specifically aim to maximize the SCP (equivalently, minimizing the distortion of the classifier output) for the same redundancy overhead.",19 "Sleeping beauty reconsidered: conditioning and reflection in asynchronous systems. A careful analysis of conditioning in the sleeping beauty problem is done, using the formal model for reasoning about knowledge and probability developed by Halpern and Tuttle. While the sleeping beauty problem has been viewed as revealing problems with conditioning in the presence of imperfect recall, the analysis done here reveals that the problems are not so much due to imperfect recall as to asynchrony. The implications of this analysis for van Fraassen's reflection principle and Savage's sure-thing principle are considered.",4 "Discussion among different methods of updating model filter in object tracking. Discriminative correlation filters (DCF) have recently shown excellent performance in the visual object tracking area. In this paper, we summarize the methods of updating the model filter in discriminative correlation filter (DCF) based tracking algorithms and analyze the similarities and differences among these methods. We deduce the relationship among updating the coefficient in high dimension (the kernel trick), updating the filter in the frequency domain and updating the filter in the spatial domain, and analyze the differences among these ways. We also analyze the difference between updating the filter directly and updating the filter's numerator (object response power) and denominator (filter's power) separately. Experiments comparing the different updating methods and visualizing the template filters are used to prove the derivation.",4 "Finding influential training samples for gradient boosted decision trees. We address the problem of finding influential training samples for a particular case of tree ensemble-based models, e.g., random forest (RF) or gradient boosted decision trees (GBDT). A natural way of formalizing this problem is studying how the model's predictions change upon leave-one-out retraining, leaving out each individual training sample. Recent work has shown that, for parametric models, this analysis can be conducted in a computationally efficient way.
We propose several ways of extending this framework to non-parametric GBDT ensembles, under the assumption that tree structures remain fixed. Furthermore, we introduce a general scheme for obtaining further approximations to our method that balance the trade-off between performance and computational complexity. We evaluate our approaches on various experimental setups and use-case scenarios and demonstrate both the quality of our approach to finding influential training samples in comparison to the baselines and its computational efficiency.",4 "Political homophily in independence movements: analysing and classifying social media users by national identity. Social media and data mining are increasingly being used to analyse political and societal issues. Here we undertake the classification of social media users as supporting or opposing ongoing independence movements in their territories. Independence movements occur in territories whose citizens have conflicting national identities; users with opposing national identities will support or oppose the sense of being part of an independent nation that differs from the officially recognised country. We describe a methodology that relies on users' self-reported location to build large-scale datasets for three territories -- Catalonia, the Basque Country and Scotland. An analysis of these datasets shows that homophily plays an important role in determining who people connect with, as users predominantly choose to follow and interact with others of the same national identity. We show that a classifier relying on users' follow networks can achieve accurate, language-independent classification performances ranging from 85% to 97% for the three territories.",4 "Improved speech reconstruction from silent video. Speechreading is the task of inferring phonetic information from visually observed articulatory facial movements, and is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible and natural-sounding acoustic speech signal from silent video frames of a speaking person. We train our model on speakers from the GRID and TCD-TIMIT datasets, and evaluate the quality and intelligibility of the reconstructed speech using common objective measurements. We show that speech predictions from the proposed model attain scores which indicate significantly improved quality over existing models.
In addition, we show promising results towards reconstructing speech from an unconstrained dictionary.",4 "Translating answer-set programs into bit-vector logic. Answer set programming (ASP) is a paradigm for declarative problem solving where problems are first formalized as rule sets, i.e., answer-set programs, in a uniform way and then solved by computing answer sets for the programs. The satisfiability modulo theories (SMT) framework follows a similar modelling philosophy, but the syntax is based on extensions of propositional logic rather than rules. Quite recently, a translation from answer-set programs into difference logic was provided---enabling the use of particular SMT solvers for the computation of answer sets. In this paper, the translation is revised for another SMT fragment, namely the one based on fixed-width bit-vector theories. Thus, even more SMT solvers can be harnessed for the task of computing answer sets. The results of a preliminary experimental comparison are also reported. They suggest a level of performance similar to that achieved via difference logic.",4 "A novel energy-aware node clustering algorithm for wireless sensor networks using a modified artificial fish swarm algorithm. Clustering problems are considered amongst the most prominent challenges in statistics and computational science. Clustering the nodes in wireless sensor networks is used to prolong the lifetime of the networks, and is one of the most difficult tasks in the clustering procedure. In order to perform node clustering, a number of nodes are determined as cluster heads and the other nodes are joined to one of these heads, based on different criteria, e.g. Euclidean distance. So far, different approaches have been proposed for this process, and swarm and evolutionary algorithms have contributed in this regard. In this study, a novel algorithm based on the artificial fish swarm algorithm (AFSA) is proposed for the clustering procedure. In the proposed method, the performance of standard AFSA is improved by increasing the balance between local and global searches. Furthermore, a new mechanism is added to the base algorithm to improve convergence speed in clustering problems. The performance of the proposed technique is compared with a number of state-of-the-art techniques in this field, and the outcomes indicate the supremacy of the proposed technique.",4 "The value iteration algorithm is not strongly polynomial for discounted dynamic programming.
This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iteration, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.",4 "Notes on information geometry and evolutionary processes. In order to analyze and extract different structural properties of distributions, one can introduce different coordinate systems over the manifold of distributions. In evolutionary computation, the Walsh bases and the building block bases are often used to describe populations, which simplifies the analysis of evolutionary operators applying on populations. Quite independent of these approaches, information geometry has been developed as a geometric way to analyze different order dependencies between random variables (e.g., neural activations or genes). In these notes we briefly review the essentials of various coordinate bases and of information geometry. The goal is to give an overview and make the approaches comparable. Besides introducing meaningful coordinate bases, information geometry also offers an explicit way to distinguish different order interactions, and it offers a geometric view of the manifold and thereby also of the operators which apply on it. For instance, uniform crossover can be interpreted as an orthogonal projection of a population along an m-geodesic, monotonously reducing the theta-coordinates which describe interactions between genes.",13 "An information extraction approach to prescreen heart failure patients for clinical trials. To reduce the large amount of time spent screening, identifying, and recruiting patients into clinical trials, we need prescreening systems able to automate the data extraction and decision-making tasks typically relegated to clinical research study coordinators. However, a major obstacle is the vast amount of patient data available only as unstructured free-form text in electronic health records. We propose an information extraction-based approach that first automatically converts unstructured text into a structured form.
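The value-iteration procedure discussed in the note above can be made concrete with a tiny discounted MDP; the 2-state, 2-action deterministic chain below is a toy example for illustrating the algorithm, not the exponential construction from the note.

```python
# Minimal value iteration for a discounted MDP with deterministic
# transitions: V(s) <- max_a [ r(s,a) + gamma * V(next(s,a)) ].
def value_iteration(n_states, n_actions, rewards, transitions,
                    gamma=0.9, tol=1e-8):
    """rewards[s][a] is the reward, transitions[s][a] the next state."""
    V = [0.0] * n_states
    while True:
        newV = [max(rewards[s][a] + gamma * V[transitions[s][a]]
                    for a in range(n_actions)) for s in range(n_states)]
        if max(abs(newV[s] - V[s]) for s in range(n_states)) < tol:
            return newV
        V = newV

rewards = [[0.0, 1.0], [2.0, 0.0]]   # rewards[s][a]
transitions = [[0, 1], [1, 0]]       # deterministic next states
V = value_iteration(2, 2, rewards, transitions)
print([round(v, 2) for v in V])  # -> [19.0, 20.0]
```

Each sweep is cheap, but the note's point is that the *number* of sweeps needed to pin down the optimal policy can blow up with problem size when exact values are required.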
The structured data are then compared against a list of eligibility criteria using a rule-based system to determine which patients qualify for enrollment in a heart failure clinical trial. We show that we can achieve highly accurate results, with recall and precision values of 0.95 and 0.86, respectively. Our system allowed us to significantly reduce the time needed to prescreen patients, from weeks to minutes. Our open-source information extraction modules are available to researchers and could be tested and validated in other cardiovascular trials. An approach such as the one we demonstrate here may decrease costs and expedite clinical trials, and could enhance the reproducibility of trials across institutions and populations.",4 "Do deep neural networks match related objects?: a survey of ImageNet-trained classification models. Deep neural networks (DNNs) have shown state-of-the-art performance on a wide range of complicated tasks. In recent years, studies have been actively conducted to analyze the black-box characteristics of DNNs and to grasp their learning behaviours, tendencies, and limitations. In this paper, we investigate a limitation of DNNs in image classification tasks and verify it with a method inspired by cognitive psychology. Through analyzing failure cases of the ImageNet classification task, we hypothesize that DNNs do not sufficiently learn to associate related classes of objects. To verify how DNNs understand relatedness between object classes, we conducted experiments on an image database from cognitive psychology. We applied ImageNet-trained DNNs to this database, which consists of pairs of related and unrelated object images, to compare feature similarities and determine whether the pairs match each other. In the experiments, we observed that the DNNs show limited performance in determining the relatedness of object classes. In addition, the DNNs present somewhat improved performance in discovering relatedness based on similarity, but perform weaker in discovering relatedness based on association. Through these experiments, a novel analysis of the learning behaviour of DNNs is provided, and a limitation which needs to be overcome is suggested.",4 "Recognizing textures with mobile cameras for pedestrian safety applications. As smartphone-rooted distractions become commonplace, the lack of compelling safety measures has led to a rise in the number of injuries to distracted walkers.
Various solutions address this problem by sensing a pedestrian's walking environment. Existing camera-based approaches are largely limited to obstacle detection and other forms of object detection. Instead, we present TerraFirma, an approach that performs material recognition on a pedestrian's walking surface. We explore, first, how well commercial off-the-shelf smartphone cameras can learn texture to distinguish among paving materials in uncontrolled outdoor urban settings. Second, we aim at identifying when a distracted user is about to enter the street, which can be used to support safety functions such as warning the user to be cautious. To this end, we gather a unique dataset of street/sidewalk imagery from a pedestrian's perspective, spanning major cities like New York, Paris, and London. We demonstrate that modern phone cameras can be enabled to distinguish materials of walking surfaces in urban areas with more than 90% accuracy, and to accurately identify when pedestrians transition from sidewalk to street.",4 "Quality of geographic information: an ontological approach and artificial intelligence tools. The objective here is to present one important aspect of the European IST-FET project ""REV!GIS"": the methodology developed for the translation (interpretation) of the quality of data into ""fitness for use"" information, which can be confronted with user needs in the application. This methodology is based upon the notion of ""ontologies"" as a conceptual framework able to capture the explicit and implicit knowledge involved in the application. We do not address the general problem of formalizing such ontologies; instead, we rather try to illustrate this with three applications which are particular cases of the general ""data fusion"" problem. In each application, we show how to deploy our methodology by comparing several possible solutions, and we try to enlighten the quality issues and the kind of solution to privilege, even at the expense of a highly complex computational approach. The expectation of the REV!GIS project is that computationally tractable solutions will be available among the next generation of AI tools.",4 "Domain adaptation with randomized expectation maximization. Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space.
Despite their success, state-of-the-art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed using a simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and support vector machines, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both the source and target domain. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.",19 "Adapting to the shifting intent of search queries. Search engines today present results that are often oblivious to abrupt shifts in intent. For example, the query `independence day' usually refers to a US holiday, but the intent of this query abruptly changed during the release of a major film by that name. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for 50% of all search queries. This paper shows that the signals a search engine receives can be used both to determine that a shift in intent has happened and to find the result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.",4 "SPP-Net: deep absolute pose regression with synthetic views. Image-based localization is one of the important problems in computer vision due to its wide applicability in robotics, augmented reality, and autonomous systems. There is a rich set of methods described in the literature on how to geometrically register a 2D image w.r.t.\ a 3D model.
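The EM-with-a-linear-classifier idea from the domain-adaptation abstract above can be caricatured as hard-EM self-training: fit a logistic classifier on the source, then alternate between pseudo-labeling the target (an E-step-like move) and refitting on source plus pseudo-labeled target (an M-step-like move). This is an illustrative sketch under that single-linear-classifier assumption, not the paper's randomized EM algorithm.

```python
import math

# Hard-EM self-training caricature of EM-based domain adaptation.
def fit_logistic(X, y, epochs=300, lr=0.1):
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            p = 1.0 / (1.0 + math.exp(-max(min(z, 30), -30)))
            g = yi - p
            for j in range(len(xi)):
                w[j] += lr * g * xi[j]
            w[-1] += lr * g
    return w

def predict(w, X):
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + w[-1] > 0 else 0
            for xi in X]

def em_adapt(Xs, ys, Xt, rounds=5):
    w = fit_logistic(Xs, ys)
    for _ in range(rounds):
        pseudo = predict(w, Xt)                  # E-step-like: label target
        w = fit_logistic(Xs + Xt, ys + pseudo)   # M-step-like: refit jointly
    return w

# Toy 1-D task: target classes sit farther from the boundary than source.
Xs = [[-1.2], [-0.8], [0.8], [1.2]]; ys = [0, 0, 1, 1]
Xt = [[-1.7], [-1.3], [1.3], [1.7]]
w = em_adapt(Xs, ys, Xt)
print(predict(w, Xt))  # -> [0, 0, 1, 1]
```

The appeal, as the abstract stresses, is that the whole loop works on any fixed feature representation, including deep features from a pre-trained network.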
Recently, methods based on deep (and convolutional) feedforward networks (CNNs) became popular for pose regression. However, CNN-based methods are still less accurate than geometry-based methods, despite being fast and memory efficient. In this work we design a deep neural network architecture based on sparse feature descriptors to estimate the absolute pose of an image. Our choice of using sparse feature descriptors has two major advantages: first, our network is significantly smaller than the CNNs proposed in the literature for this task---thereby making our approach more efficient and scalable. Second---and more importantly---the usage of sparse features allows us to augment the training data with synthetic viewpoints, which leads to substantial improvements in generalization performance to unseen poses. Thus, our proposed method aims to combine the best of the two worlds---feature-based localization and CNN-based pose regression---to achieve state-of-the-art performance in absolute pose estimation. A detailed analysis of the proposed architecture and a rigorous evaluation on existing datasets are provided to support our method.",4 "Tensor principal component analysis via sum-of-squares proofs. We study a statistical model for the tensor principal component analysis problem introduced by Montanari and Richard: given an order-$3$ tensor $T$ of the form $T = \tau \cdot v_0^{\otimes 3} + A$, where $\tau \geq 0$ is a signal-to-noise ratio, $v_0$ is a unit vector, and $A$ is a random noise tensor, the goal is to recover the planted vector $v_0$. For the case that $A$ has iid standard Gaussian entries, we give an efficient algorithm to recover $v_0$ whenever $\tau \geq \omega(n^{3/4} \log(n)^{1/4})$, and certify that the recovered vector is close to a maximum likelihood estimator, all with high probability over the random choice of $A$. The previous best algorithms with provable guarantees required $\tau \geq \Omega(n)$. In the regime $\tau \leq o(n)$, natural tensor-unfolding-based spectral relaxations for the underlying optimization problem break down (in the sense that their integrality gap is large). To go beyond this barrier, we use convex relaxations based on the sum-of-squares method. Our recovery algorithm proceeds by rounding a degree-$4$ sum-of-squares relaxation of the maximum-likelihood-estimation problem for the statistical model.
To complement our algorithmic results, we show that degree-$4$ sum-of-squares relaxations break down for $\tau \leq o(n^{3/4}/\log(n)^{1/4})$, which demonstrates that improving our current guarantees (by more than logarithmic factors) would require new techniques or might even be intractable. Finally, we show how to exploit additional problem structure in order to solve our sum-of-squares relaxations, up to some approximation, very efficiently. Our fastest algorithm runs in nearly-linear time using a shifted (matrix) power iteration and has similar guarantees as above. The analysis of this algorithm also confirms a variant of a conjecture of Montanari and Richard about singular vectors of tensor unfoldings.",4 "Constructing category-specific models for monocular object-SLAM. We present a new paradigm for real-time object-oriented SLAM with a monocular camera. Contrary to previous approaches that rely on object-level models, we construct category-level models from CAD collections which are now widely available. To alleviate the need for huge amounts of labeled data, we develop a rendering pipeline that enables synthesis of large datasets from a limited amount of manually labeled data. Using data thus synthesized, we learn category-level models for object deformations in 3D, as well as discriminative object features in 2D. These category models are instance-independent and aid in the design of object landmark observations that can be incorporated in a generic monocular SLAM framework. Where typical object-SLAM approaches usually solve only for object and camera poses, we also estimate object shape on-the-fly, allowing a wide range of objects from the category to be present in the scene. Moreover, since our 2D object features are learned discriminatively, the proposed object-SLAM system succeeds in several scenarios where sparse feature-based monocular SLAM fails due to insufficient features or parallax. Also, the proposed category models help in object instance retrieval, useful for augmented reality (AR) applications. We evaluate the proposed framework on multiple challenging real-world scenes and show --- to the best of our knowledge --- the first results of an instance-independent monocular object-SLAM system and the benefits it enjoys over feature-based SLAM methods.",4 "Phase-only planar antenna array synthesis with fuzzy genetic algorithms.
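The tensor-unfolding spectral idea underlying the tensor PCA abstract above can be sketched as follows: unfold the 3-tensor into an n-by-n² matrix and run plain power iteration on M Mᵀ to recover the planted vector. This illustrates only the unfolding idea on an easy (large signal-to-noise) toy instance; the paper's fast method is a *shifted* power iteration with much stronger guarantees.

```python
import math, random

# Power iteration on the mode-1 unfolding of T = tau * v0⊗v0⊗v0 + noise.
def power_iter_unfolding(T, n, iters=30, seed=1):
    rng = random.Random(seed)
    # mode-1 unfolding: rows indexed by i, columns by the pair (j, k)
    M = [[T[i][j][k] for j in range(n) for k in range(n)] for i in range(n)]
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iters):
        y = [sum(M[i][c] * v[i] for i in range(n)) for c in range(n * n)]  # M^T v
        v = [sum(M[i][c] * y[c] for c in range(n * n)) for i in range(n)]  # M y
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

n = 4
v0 = [0.6, 0.8, 0.0, 0.0]        # planted unit vector
rng = random.Random(7)
T = [[[5.0 * v0[i] * v0[j] * v0[k] + rng.gauss(0, 0.05)
       for k in range(n)] for j in range(n)] for i in range(n)]
v = power_iter_unfolding(T, n)
print(round(abs(sum(a * b for a, b in zip(v, v0))), 2))
```

Up to sign, the top left singular vector of the unfolding aligns with the planted vector when the signal term dominates the noise spectrum.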
This paper describes a new method for the synthesis of planar antenna arrays using fuzzy genetic algorithms (FGAs), optimizing the phase excitation coefficients to best meet a desired radiation pattern. We present an application of a rigorous optimization technique based on fuzzy genetic algorithms (FGAs); the optimizing algorithm is obtained by adjusting the control parameters of the standard version of the genetic algorithm (SGA) using a fuzzy logic controller (FLC), depending on the best individual fitness and the population diversity measurements (PDM). The presented optimization algorithms are first checked on a specific mathematical test function and show superior capabilities with respect to the standard version (SGA). A planar array of rectangular cells using probe feed is considered. An included example using the FGA demonstrates good agreement between the desired and calculated radiation patterns, better than that obtained with the SGA.",4 "A domain-independent algorithm for plan adaptation. The paradigms of transformational planning, case-based planning, and plan debugging all involve a process known as plan adaptation - modifying or repairing an old plan so it solves a new problem. In this paper we provide a domain-independent algorithm for plan adaptation, demonstrate that it is sound, complete, and systematic, and compare it to other adaptation algorithms in the literature. Our approach is based on a view of planning as searching a graph of partial plans. Generative planning starts at the graph's root and moves from node to node using plan-refinement operators. In planning by adaptation, a library plan - an arbitrary node in the plan graph - is the starting point for the search, and the plan-adaptation algorithm can apply both the refinement operators available to a generative planner and also retract constraints and steps from the plan. Our algorithm's completeness ensures that the adaptation algorithm will eventually search the entire graph, and its systematicity ensures that it will do so without redundantly searching any parts of the graph.",4 "Optical images-based edge detection in synthetic aperture radar images. We address the issue of adapting optical images-based edge detection techniques for use in polarimetric synthetic aperture radar (PolSAR) imagery.
We modify the gravitational edge detection technique (inspired by the law of universal gravity) proposed by Lopez-Molina et al, using the non-standard neighbourhood configuration proposed by Fu et al, to reduce the speckle noise in polarimetric SAR imagery. We compare the modified and unmodified versions of the gravitational edge detection technique with the well-established one proposed by Canny, as well as with a recent multiscale fuzzy-based technique proposed by Lopez-Molina et al. We also address the issues of aggregation of gray-level images and of edge detection filtering. The techniques addressed are applied to a mosaic built using class distributions obtained from a real scene, as well as to a true PolSAR image; the mosaic results are assessed using Baddeley's delta metric. Our experiments show that modifying the gravitational edge detection technique with the non-standard neighbourhood configuration produces better results than the original technique, as well as the other techniques used for comparison. These experiments show that adapting edge detection methods from computational intelligence for use in PolSAR imagery is a new field worthy of exploration.",4 "Minimally faithful inversion of graphical models. Inference amortization methods allow sharing statistical strength across related observations when learning to perform posterior inference. Generally this requires the inversion of the dependency structure of the generative model, as the modeller must design and learn a distribution to approximate the posterior. Previous methods invert the dependency structure in a heuristic way and fail to capture important dependencies in the model, therefore limiting the performance of the eventual inference algorithm. We introduce an algorithm for faithfully and minimally inverting the graphical model structure of a generative model. Such inversions have two crucial properties: a) they do not encode any independence assertions that are absent from the model, and b) given that, they encode as many true independence assertions as possible. Our algorithm works by simulating variable elimination on the generative model to reparametrize the distribution. We show in experiments how such minimal inversions assist in performing better inference.",19 "Towards a new science of a clinical data intelligence.
In this paper we define clinical data intelligence as the analysis of data generated in the clinical routine with the goal of improving patient care. We define a science of a clinical data intelligence as a data analysis that permits the derivation of scientific, i.e., generalizable and reliable, results. We argue that a science of a clinical data intelligence is sensible in the context of a big data analysis, i.e., with data from many patients and with complete patient information. We discuss that clinical data intelligence requires the joint efforts of knowledge engineering, information extraction (from textual and other unstructured data), and statistics and statistical machine learning. We describe some main results as conjectures and relate them to a recently funded research project involving two major German university hospitals.",4 "Uncertainty measurement with belief entropy on the interference effect in a quantum-like Bayesian network. Social dilemmas have been regarded as the essence of evolutionary game theory, in which the prisoner's dilemma game is the most famous metaphor for the problem of cooperation. Recent findings revealed that people's behavior violated the sure thing principle in such games. Classic probability methodologies have difficulty explaining the underlying mechanisms of people's behavior. In this paper, a novel quantum-like Bayesian network is proposed to accommodate this paradoxical phenomenon. The special network can take interference into consideration, which is likely to be an efficient way to describe the underlying mechanism. With the assistance of belief entropy, named Deng entropy, this paper proposes a belief distance to render the model practical. Tested with empirical data, the proposed model is proved to be predictable and effective.",4 "Human-in-the-loop artificial intelligence. Little by little, newspapers are revealing the bright future that artificial intelligence (AI) is building. Intelligent machines will help everywhere. However, this bright future has a dark side: a dramatic job market contraction before its unpredictable transformation. Hence, in the near future, large numbers of job seekers will need financial support while catching up with these novel unpredictable jobs. This possible job market crisis has an antidote inside. In fact, the rise of AI is sustained by the biggest knowledge theft of recent years. Learning AI machines are extracting knowledge from unaware skilled or unskilled workers by analyzing their interactions.
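The Deng (belief) entropy mentioned above has a standard closed form over a basic probability assignment m: E_d(m) = -Σ_A m(A) · log2( m(A) / (2^|A| - 1) ), summed over focal elements A. A minimal sketch follows; representing focal elements as frozensets is my own choice for illustration.

```python
import math

# Deng (belief) entropy of a basic probability assignment.
# bpa maps each focal element (a frozenset) to its mass.
def deng_entropy(bpa):
    total = 0.0
    for focal, mass in bpa.items():
        if mass > 0:
            total -= mass * math.log2(mass / (2 ** len(focal) - 1))
    return total

# With only singleton focal elements it reduces to Shannon entropy:
m1 = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}
print(deng_entropy(m1))  # -> 1.0

# Mass on a multi-element set adds non-specificity to the entropy:
m2 = {frozenset({"a", "b"}): 1.0}
print(deng_entropy(m2))  # -> log2(3) ~ 1.585
```

The extra 2^|A| - 1 term is what rewards mass spread over larger focal elements, which is the ingredient the belief-distance construction builds on.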
passionately jobs, workers digging graves. paper, propose human-in-the-loop artificial intelligence (hit-ai) fairer paradigm artificial intelligence systems. hit-ai reward aware unaware knowledge producers different scheme: decisions ai systems generating revenues repay legitimate owners knowledge used taking decisions. modern robin hoods, hit-ai researchers fight fairer artificial intelligence gives back steals.",4 "using state space differential geometry nonlinear blind source separation. given time series multicomponent measurements evolving stimulus, nonlinear blind source separation (bss) seeks find ""source"" time series, comprised statistically independent combinations measured components. paper, seek source time series local velocity cross correlations vanish everywhere stimulus state space. however, earlier paper local velocity correlation matrix shown constitute metric state space. therefore, nonlinear bss maps onto problem differential geometry: given metric observed measurement coordinate system, find another coordinate system metric diagonal everywhere. show determine observed data separable way, and, are, show construct required transformation source coordinate system, essentially unique except unknown rotation found applying methods linear bss. thus, proposed technique solves nonlinear bss many situations or, least, reduces linear bss, without use probabilistic, parametric, iterative procedures. paper also describes generalization methodology performs nonlinear independent subspace separation. every case, resulting decomposition observed data intrinsic property stimulus' evolution sense depend way observer chooses view (e.g., choice observing machine's sensors). words, decomposition property evolution ""real"" stimulus ""out there"" broadcasting energy observer. technique illustrated analytic numerical examples.",4 "research multiple feature fusion image retrieval algorithm based texture feature rough set theory. 
recently, witnessed explosive growth images complex information content. order effectively precisely retrieve desired images large-scale image database low time-consuming, propose multiple feature fusion image retrieval algorithm based texture feature rough set theory paper. contrast conventional approaches use single feature standard, fuse different features operation normalization. rough set theory assist us enhance robustness retrieval system facing incomplete data warehouse. enhance texture extraction paradigm, use wavelet gabor function holds better robustness. addition, perspectives internal external normalization, re-organize extracted feature better combination. numerical experiment verified general feasibility methodology. enhance overall accuracy compared state-of-the-art algorithms.",4 "inversenet: solving inverse problems splitting networks. propose new method uses deep learning techniques solve inverse problems. inverse problem cast form learning end-to-end mapping observed data ground-truth. inspired splitting strategy widely used regularized iterative algorithm tackle inverse problems, mapping decomposed two networks, one handling inversion physical forward model associated data term one handling denoising output former network, i.e., inverted version, associated prior/regularization term. two networks trained jointly learn end-to-end mapping, getting rid two-step training. training annealing intermediate variable two networks bridges gap input (the degraded version output) output progressively approaches ground-truth. proposed network, referred inversenet, flexible sense existing end-to-end network structure leveraged first network existing denoising network structure used second one. 
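As a toy illustration of the splitting idea behind inversenet described above, the sketch below composes an "inversion" stage for the physical forward model with a "denoising" stage on its output; the regularized pseudo-inverse and soft-thresholding stand in for the two learned networks and are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.tril(np.ones((8, 8))) / 8.0          # a toy linear forward operator

def invert(y, lam=1e-4):
    # regularized pseudo-inverse, standing in for the learned inversion net
    return np.linalg.solve(A.T @ A + lam * np.eye(8), A.T @ y)

def denoise(x, tau=0.05):
    # soft-thresholding, standing in for the learned denoising net
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

x_true = rng.standard_normal(8)
y = A @ x_true + 0.01 * rng.standard_normal(8)  # degraded observation
x_hat = denoise(invert(y))                      # end-to-end mapping y -> x
err = float(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

Training the two stages jointly, as the paper does, removes the need to hand-tune the split the way `lam` and `tau` are fixed here.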
extensive experiments synthetic data real datasets tasks, motion deblurring, super-resolution, colorization, demonstrate efficiency accuracy proposed method compared image processing algorithms.",4 "automatic mapping french discourse connectives pdtb discourse relations. paper, present approach exploit phrase tables generated statistical machine translation order map french discourse connectives discourse relations. using approach, created concoledisco, lexicon french discourse connectives pdtb relations. evaluated lexconn, concoledisco achieves recall 0.81 average precision 0.68 concession condition relations.",4 "measuring relations concepts conceptual spaces. highly influential framework conceptual spaces provides geometric way representing knowledge. instances represented points high-dimensional space concepts represented regions space. recent mathematical formalization framework capable representing correlations different domains geometric way. paper, extend formalization providing quantitative mathematical definitions notions concept size, subsethood, implication, similarity, betweenness. considerably increases representational power formalization introducing measurable ways describing relations concepts.",4 "factored particles scalable monitoring. exact monitoring dynamic bayesian networks intractable, approximate algorithms necessary. paper presents new family approximate monitoring algorithms combine best qualities particle filtering boyen-koller methods. algorithms maintain approximate representation belief state form sets factored particles, correspond samples clusters state variables. empirical results show algorithms outperform ordinary particle filtering boyen-koller algorithm large systems.",4 "generative model group conversation. conversations non-player characters (npcs) games typically confined dialogue human player virtual agent, conversation initiated controlled player. 
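A generative group-conversation model of this kind needs a turn-allocation rule; the personality-weighted sampler below is a toy sketch (the `talkativeness` parameter and weighted selection are my own simplification, not the paper's rule set for turn taking, interruption, and belief change):

```python
import random

def simulate_turns(talkativeness, steps, seed=0):
    """Allocate speaking turns among agents in proportion to a
    personality trait; returns per-agent turn counts."""
    rng = random.Random(seed)
    agents = list(talkativeness)
    weights = [talkativeness[a] for a in agents]
    counts = {a: 0 for a in agents}
    for _ in range(steps):
        speaker = rng.choices(agents, weights=weights)[0]
        counts[speaker] += 1
    return counts

counts = simulate_turns({"alice": 0.9, "bob": 0.1}, steps=1000)
```

Even this crude rule reproduces the expressive-range observation quoted later: character personality parameters predict how often each agent speaks.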
create richer, believable environments players, need conversational behavior reflect initiative part npcs, including conversations include multiple npcs interact one another well player. describe generative computational model group conversation agents, abstract simulation discussion small group setting. define conversational interactions terms rules turn taking interruption, well belief change, sentiment change, emotional response, dependent agent personality, context, relationships. evaluate model using parameterized expressive range analysis, observing correlations simulation parameters features resulting conversations. analysis confirms, example, character personalities predict often speak, heterogeneous groups characters generate belief change.",4 "complexity optimized crossover binary representations. consider computational complexity producing best possible offspring crossover, given two solutions parents. crossover operators studied class boolean linear programming problems, boolean vector variables used solution representation. means efficient reductions optimized gene transmitting crossover problems (ogtc) show polynomial solvability ogtc maximum weight set packing problem, minimum weight set partition problem one versions simple plant location problem. study connection ogtc linear boolean programming problem maximum weight independent set problem 2-colorable hypergraph prove np-hardness several special cases ogtc problem boolean linear programming.",4 "time-dependent hierarchical dirichlet model timeline generation. timeline generation aims summarizing news different epochs telling readers event evolves. new challenge combines salience ranking novelty detection. long-term public events, main topic usually includes various aspects across different epochs aspect evolving pattern. existing approaches neglect hierarchical topic structure involved news corpus timeline generation. 
paper, develop novel time-dependent hierarchical dirichlet model (hdm) timeline generation. model aptly detect different levels topic information across corpus structure used sentence selection. based topic mined from hdm, sentences selected considering different aspects relevance, coherence coverage. develop experimental systems evaluate 8 long-term events public concern. performance comparison different systems demonstrates effectiveness model terms rouge metrics.",4 "supervised learning multilayer spiking neural networks. current article introduces supervised learning algorithm multilayer spiking neural networks. algorithm presented overcomes limitations existing learning algorithms applied neurons firing multiple spikes principle applied linearisable neuron model. algorithm applied successfully various benchmarks, xor problem iris data set, well complex classification problems. simulations also show flexibility supervised learning algorithm permits different encodings spike timing patterns, including precise spike trains encoding.",4 "semi-automatic algorithm breast mri lesion segmentation using marker-controlled watershed transformation. magnetic resonance imaging (mri) effective imaging modality identifying localizing breast lesions women. accurate precise lesion segmentation using computer-aided-diagnosis (cad) system, crucial step evaluating tumor volume quantification tumor characteristics. however, challenging task, since breast lesions sophisticated shape, topological structure, high variance intensity distribution across patients. paper, propose novel marker-controlled watershed transformation-based approach, uses brightest pixels region interest (determined experts) markers overcome challenge, accurately segment lesions breast mri. proposed approach evaluated 106 lesions, includes 64 malignant 42 benign cases. segmentation results quantified comparison ground truth labels, using dice similarity coefficient (dsc) jaccard index (ji) metrics. 
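The dsc and ji metrics cited above have standard definitions; a minimal sketch with binary masks encoded as sets of pixel coordinates (an encoding chosen here for brevity):

```python
def dice_jaccard(pred, truth):
    """Dice similarity coefficient and Jaccard index for binary masks
    given as sets of (row, col) pixel coordinates."""
    inter = len(pred & truth)
    dsc = 2 * inter / (len(pred) + len(truth))
    ji = inter / len(pred | truth)
    return dsc, ji

pred = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
dsc, ji = dice_jaccard(pred, truth)
```

Here two of three predicted pixels overlap the truth, giving dsc = 2/3 and ji = 1/2; dsc is always at least as large as ji for the same pair of masks.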
proposed method achieved average dice coefficient 0.7808$\pm$0.1729 jaccard index 0.6704$\pm$0.2167. results illustrate proposed method shows promise future work related segmentation classification benign malignant breast lesions.",4 "editorial first workshop mining scientific papers: computational linguistics bibliometrics. workshop ""mining scientific papers: computational linguistics bibliometrics"" (clbib 2015), co-located 15th international society scientometrics informetrics conference (issi 2015), brought together researchers bibliometrics computational linguistics order study ways bibliometrics benefit large-scale text analytics sense mining scientific papers, thus exploring interdisciplinarity bibliometrics natural language processing (nlp). goals workshop answer questions like: enhance author network analysis bibliometrics using data obtained text analytics? insights nlp provide structure scientific writing, citation networks, in-text citation analysis? workshop first step foster reflection interdisciplinarity benefits two disciplines bibliometrics natural language processing drive it.",4 "$k$-center clustering perturbation resilience. $k$-center problem canonical long-studied facility location clustering problem many applications symmetric asymmetric forms. versions problem tight approximation factors worst case instances: $2$-approximation symmetric $k$-center $o(\log^*(k))$-approximation asymmetric version. work, go beyond worst case provide strong positive results asymmetric symmetric $k$-center problems natural input stability (promise) condition called $\alpha$-perturbation resilience (bilu & linial 2012) , states optimal solution change $\alpha$-factor perturbation input distances. show assuming 2-perturbation resilience, exact solution asymmetric $k$-center problem found polynomial time. knowledge, first problem hard approximate constant factor worst case, yet optimally solved polynomial time perturbation resilience constant value $\alpha$. 
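For context on the symmetric k-center problem discussed above, the classical 2-approximation (Gonzalez's farthest-first traversal) can be sketched as follows; this is the well-known worst-case baseline, not the paper's perturbation-resilience algorithm:

```python
def greedy_k_center(points, k):
    """Farthest-first traversal: pick an arbitrary first center, then
    repeatedly add the point farthest from all chosen centers.
    Returns the centers and the resulting covering radius."""
    def d(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(d(p, c) for c in centers)))
    radius = max(min(d(p, c) for c in centers) for p in points)
    return centers, radius

pts = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0)]
centers, radius = greedy_k_center(pts, 3)
```

On this toy instance the three natural clusters are found and every point lies within distance 1 of a center; in general the radius is guaranteed to be at most twice the optimum for symmetric metrics.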
furthermore, prove result tight showing symmetric $k$-center $(2-\epsilon)$-perturbation resilience hard unless $np=rp$. first tight result problem perturbation resilience, i.e., first time exact value $\alpha$ problem switches np-hard efficiently computable found. results illustrate surprising relationship symmetric asymmetric $k$-center instances perturbation resilience. unlike approximation ratio, symmetric $k$-center easily solved factor $2$ asymmetric $k$-center cannot approximated constant factor, symmetric asymmetric $k$-center solved optimally resilience 2-perturbations.",4 "image restoration using autoencoding priors. propose leverage denoising autoencoder networks priors address image restoration problems. build key observation output optimal denoising autoencoder local mean true data density, autoencoder error (the difference output input trained autoencoder) mean shift vector. use magnitude mean shift vector, is, distance local mean, negative log likelihood natural image prior. image restoration, maximize likelihood using gradient descent backpropagating autoencoder error. key advantage approach need train separate networks different image restoration tasks, non-blind deconvolution different kernels, super-resolution different magnification factors. demonstrate state art results non-blind deconvolution super-resolution using autoencoding prior.",4 "home: household multimodal environment. introduce home: household multimodal environment artificial agents learn vision, audio, semantics, physics, interaction objects agents, within realistic context. home integrates 45,000 diverse 3d house layouts based suncg dataset, scale may facilitate learning, generalization, transfer. home open-source, openai gym-compatible platform extensible tasks reinforcement learning, language grounding, sound-based navigation, robotics, multi-agent learning, more. 
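The autoencoding-prior update described above (treat the autoencoder error D(x) - x as a mean-shift vector and take gradient steps along it) can be illustrated with a hand-made "denoiser" that pulls toward a known mode; `dae`, `mu`, and the step size below are illustrative assumptions standing in for a trained denoising autoencoder:

```python
# Toy density with a single mode at mu; the ideal denoiser output is the
# local mean, so D(x) - x points toward mu (the mean-shift vector).
mu = 3.0

def dae(x, pull=0.3):
    """Stand-in denoising autoencoder: nudges the input toward the mode."""
    return x + pull * (mu - x)

x = -5.0
for _ in range(200):
    x += 0.5 * (dae(x) - x)   # gradient step on the negative log prior
```

The iterate contracts toward the mode at each step, which is exactly why the magnitude of the mean-shift vector can serve as a negative log likelihood surrogate during restoration.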
hope home better enables artificial agents learn humans do: interactive, multimodal, richly contextualized setting.",4 "classification ultrahigh-dimensional features. although much progress made classification high-dimensional features \citep{fan_fan:2008, jguo:2010, caisun:2014, prxu:2014}, classification ultrahigh-dimensional features, wherein features much outnumber sample size, defies existing work. paper introduces novel computationally feasible multivariate screening classification method ultrahigh-dimensional data. leveraging inter-feature correlations, proposed method enables detection marginally weak sparse signals recovery true informative feature set, achieves asymptotic optimal misclassification rates. also show proposed procedure provides powerful discovery boundaries compared \citet{caisun:2014} \citet{jjin:2009}. performance proposed procedure evaluated using simulation studies demonstrated via classification patients different post-transplantation renal functional types.",19 "multiset model multi-species evolution solve big deceptive problems. chapter presents smuga, integration symbiogenesis multiset genetic algorithm (muga). symbiogenetic approach used based host-parasite model novelty varying length parasites along evolutionary process. additionally, models collaborations multiple parasites single host. improve efficiency, introduced proxy evaluation parasites, saves fitness function calls exponentially reduces symbiotic collaborations produced. another novel feature consists breaking evolutionary cycle two phases: symbiotic phase phase independent evolution hosts parasites. smuga tested optimization variety deceptive functions, results one order magnitude better state art symbiotic algorithms. allowed optimize deceptive problems large sizes, showed linear scaling number iterations attain optimum.",4 "heuristic method generate better initial population evolutionary methods. 
initial population plays important role heuristic algorithms ga help decrease time algorithms need achieve acceptable result. furthermore, may influence quality final answer given evolutionary algorithms. paper, shall introduce heuristic method generate target based initial population possess two mentioned characteristics. efficiency proposed method shown presenting results tests benchmarks.",4 "local contrast learning. learning deep model small data yet opening challenging problem. focus one-shot classification deep learning approach based small quantity training samples. proposed novel deep learning approach named local contrast learning (lcl) based key insight human cognitive behavior human recognizes objects specific context contrasting objects context her/his memory. lcl used train deep model contrast recognizing sample couple contrastive samples randomly drawn shuffled. one-shot classification task omniglot, deep model based lcl 122 layers 1.94 millions parameters, trained tiny dataset 60 classes 20 samples per class, achieved accuracy 97.99% outperforms human state-of-the-art established bayesian program learning (bpl) trained 964 classes. lcl fundamental idea applied alleviate parametric model's overfitting resulted lack training samples.",4 "future frame prediction anomaly detection -- new baseline. anomaly detection videos refers identification events conform expected behavior. however, almost existing methods tackle problem minimizing reconstruction errors training data, cannot guarantee larger reconstruction error abnormal event. paper, propose tackle anomaly detection problem within video prediction framework. best knowledge, first work leverages difference predicted future frame ground truth detect abnormal event. 
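A common way to turn the predicted-vs-ground-truth frame difference described above into an anomaly score is psnr, where low psnr (large prediction error) flags abnormal events; the sketch below assumes frames flattened to float lists and is a generic scoring rule, not necessarily the paper's exact one:

```python
import math

def psnr(pred, actual, peak=255.0):
    """Peak signal-to-noise ratio between a predicted and an observed
    frame; lower values indicate larger prediction error."""
    mse = sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)
    return float('inf') if mse == 0 else 10 * math.log10(peak ** 2 / mse)

normal = psnr([100.0, 120.0], [101.0, 119.0])     # small error: high psnr
abnormal = psnr([100.0, 120.0], [160.0, 40.0])    # large error: low psnr
```

Thresholding such a per-frame score is then enough to separate expected behavior from events the predictor could not anticipate.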
predict future frame higher quality normal events, commonly used appearance (spatial) constraints intensity gradient, also introduce motion (temporal) constraint video prediction enforcing optical flow predicted frames ground truth frames consistent, first work introduces temporal constraint video prediction task. spatial motion constraints facilitate future frame prediction normal events, consequently facilitate identify abnormal events conform expectation. extensive experiments toy dataset publicly available datasets validate effectiveness method terms robustness uncertainty normal events sensitivity abnormal events.",4 "fast support vector machines using parallel adaptive shrinking distributed systems. support vector machines (svm), popular machine learning technique, applied wide range domains science, finance, social networks supervised learning. whether identifying high-risk patients health-care professionals, potential high-school students enroll college school districts, svms play major role social good. paper undertakes challenge designing scalable parallel svm training algorithm large scale systems, includes commodity multi-core machines, tightly connected supercomputers cloud computing systems. intuitive techniques improving time-space complexity including adaptive elimination samples faster convergence sparse format representation proposed. sample elimination, several heuristics {\em earliest possible} {\em lazy} elimination non-contributing samples proposed. several cases, early sample elimination might result false positive, low overhead mechanisms reconstruction key data structures proposed. algorithm heuristics implemented evaluated various publicly available datasets. empirical evaluation shows 26x speed improvement datasets sequential baseline, evaluated multiple compute nodes, improvement execution time 30-60\% readily observed number datasets parallel baseline.",4 "nonparametric regression using deep neural networks relu activation function. 
consider multivariate nonparametric regression model. shown estimators based sparsely connected deep neural networks relu activation function properly chosen network architecture achieve minimax rates convergence (up log n-factors) general composition assumption regression function. framework includes many well-studied structural constraints (generalized) additive models. lot flexibility network architecture, tuning parameter sparsity network. specifically, consider large networks number potential parameters much bigger sample size. analysis gives insights multilayer feedforward neural networks perform well practice. interestingly, depth (number layers) neural network architectures plays important role theory suggests scaling network depth logarithm sample size natural.",12 "exploiting multi-layer graph factorization multi-attributed graph matching. multi-attributed graph matching problem finding correspondences two sets data considering complex properties described multiple attributes. however, information multiple attributes likely oversimplified process makes integrated attribute, degrades matching accuracy. reason, multi-layer graph structure-based algorithm proposed recently. effectively avoid problem separating attributes multiple layers. nonetheless, several remaining issues scalability problem caused huge matrix describe multi-layer structure back-projection problem caused continuous relaxation quadratic assignment problem. work, propose novel multi-attributed graph matching algorithm based multi-layer graph factorization. reformulate problem solved several small matrices obtained factorizing multi-layer structure. then, solve problem using convex-concave relaxation procedure multi-layer structure. proposed algorithm exhibits better performance state-of-the-art algorithms based single-layer structure.",4 "projected subgradient methods learning sparse gaussians. gaussian markov random fields (gmrfs) useful broad range applications. 
paper tackle problem learning sparse gmrf high-dimensional space. approach uses l1-norm regularization inverse covariance matrix. utilize novel projected gradient method, faster previous methods practice equal best performing asymptotic complexity. also extend l1-regularized objective problem sparsifying entire blocks within inverse covariance matrix. methods generalize fairly easily case, methods not. demonstrate extensions give better generalization performance two real domains--biological network analysis 2d-shape modeling image task.",4 "one 3-parameter model testing. article offers 3-parameter model testing, 1) difference ability level examinee item difficulty; 2) examinee discrimination 3) item discrimination model parameters.",4 "iterated tabu search algorithm packing unequal circles circle. paper presents iterated tabu search algorithm (denoted its-pucc) solving problem packing unequal circles circle. algorithm exploits continuous combinatorial nature unequal circles packing problem. uses continuous local optimization method generate locally optimal packings. meanwhile, builds neighborhood structure set local minimum via two appropriate perturbation moves integrates two combinatorial optimization methods, tabu search iterated local search, systematically search good local minima. computational experiments two sets widely-used test instances prove effectiveness efficiency. first set 46 instances coming famous circle packing contest second set 24 instances widely used literature, algorithm able discover respectively 14 16 better solutions previous best-known records.",12 "copa: constrained parafac2 sparse & large datasets. parafac2 demonstrated success modeling irregular tensors, tensor dimensions vary across one modes. example scenario jointly modeling treatments across set patients varying number medical encounters, alignment events time bears clinical meaning, may also impossible align due varying length. 
despite recent improvements scaling unconstrained parafac2, model factors usually dense sensitive noise limits interpretability. result, following open challenges remain: a) various modeling constraints, temporal smoothness, sparsity non-negativity, needed imposed interpretable temporal modeling b) scalable approach required support constraints efficiently large datasets. tackle challenges, propose constrained parafac2 (copa) method, carefully incorporates optimization constraints temporal smoothness, sparsity, non-negativity resulting factors. efficiently support constraints, copa adopts hybrid optimization framework using alternating optimization alternating direction method multipliers (ao-admm). evaluated large electronic health record (ehr) datasets hundreds thousands patients, copa achieves significant speedups (up 36x faster) prior parafac2 approaches attempt handle subset constraints copa enables. overall, method outperforms baselines attempting handle subset constraints terms speed, achieving level accuracy.",4 "towards effective codebookless model image classification. bag-of-features (bof) model image classification thoroughly studied last decade. different widely used bof methods modeled images pre-trained codebook, alternative codebook free image modeling method, call codebookless model (clm), attracted little attention. paper, present effective clm represents image single gaussian classification. embedding gaussian manifold vector space, show simple incorporation clm linear classifier achieves competitive accuracy compared state-of-the-art bof methods (e.g., fisher vector). since clm lies high dimensional riemannian manifold, propose joint learning method low-rank transformation support vector machine (svm) classifier gaussian manifold, order reduce computational storage cost. study alleviate side effect background clutter clm, also present simple yet effective partial background removal method based saliency detection. 
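The codebookless model above represents an image by a single gaussian over its local descriptors; a minimal sketch (random descriptors stand in for real local features, and stacking the mean with the covariance's upper triangle is one simple fixed-length embedding choice, not the paper's manifold embedding):

```python
import numpy as np

rng = np.random.default_rng(1)
descriptors = rng.standard_normal((500, 4))   # stand-in local features

# Single-Gaussian image model: mean vector plus regularized covariance.
mean = descriptors.mean(axis=0)
cov = np.cov(descriptors, rowvar=False) + 1e-6 * np.eye(4)

# Fixed-length vectorization: mean concatenated with the covariance's
# upper triangle, usable directly by a linear classifier.
iu = np.triu_indices(4)
embedding = np.concatenate([mean, cov[iu]])
```

For 4-dimensional descriptors this yields a 14-dimensional image representation with no codebook training at all, which is the appeal of the codebookless route.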
experiments extensively conducted eight widely used databases demonstrate effectiveness efficiency clm method.",4 "efficient sum outer products dictionary learning (soup-dil) - $\ell_0$ method. sparsity natural signals images transform domain dictionary extensively exploited several applications compression, denoising inverse problems. recently, data-driven adaptation synthesis dictionaries shown promise many applications compared fixed analytical dictionary models. however, dictionary learning problems typically non-convex np-hard, usual alternating minimization approaches problems often computationally expensive, computations dominated np-hard synthesis sparse coding step. work, investigate efficient method $\ell_{0}$ ""norm""-based dictionary learning first approximating training data set sum sparse rank-one matrices using block coordinate descent approach estimate unknowns. proposed block coordinate descent algorithm involves efficient closed-form solutions. particular, sparse coding step involves simple form thresholding. provide convergence analysis proposed block coordinate descent approach. numerical experiments show promising performance significant speed-ups provided method classical k-svd scheme sparse signal representation image denoising.",4 "neural recovery machine chinese dropped pronoun. dropped pronouns (dps) ubiquitous pro-drop languages like chinese, japanese etc. previous work mainly focused painstakingly exploring empirical features dps recovery. paper, propose neural recovery machine (nrm) model recover dps chinese, avoid non-trivial feature engineering process. experimental results show proposed nrm significantly outperforms state-of-the-art approaches two heterogeneous datasets. experiment results chinese zero pronoun (zp) resolution show performance zp resolution also improved recovering zps dps.",4 "ridi: robust imu double integration. 
paper proposes novel data-driven approach inertial navigation, learns estimate trajectories natural human motions inertial measurement unit (imu) every smartphone. key observation human motions repetitive consist major modes (e.g., standing, walking, turning). algorithm regresses velocity vector history linear accelerations angular velocities, corrects low-frequency bias linear accelerations, integrated twice estimate positions. acquired training data ground-truth motions across multiple human subjects multiple phone placements (e.g., bag hand). qualitative quantitative evaluations demonstrated algorithm surprisingly shown comparable results full visual inertial navigation. knowledge, paper first integrate sophisticated machine learning techniques inertial navigation, potentially opening new line research domain data-driven inertial navigation. publicly share code data facilitate research.",4 "discrete symbolic optimization boltzmann sampling continuous neural dynamics: gradient symbolic computation. gradient symbolic computation proposed means solving discrete global optimization problems using neurally plausible continuous stochastic dynamical system. gradient symbolic dynamics involves two free parameters must adjusted function time obtain global maximizer end computation. provide summary known gsc dynamics special cases settings parameters, also establish schedule two parameters convergence correct answer occurs high probability. results put empirical results already obtained gsc sound theoretical footing.",4 "synapse cap 2017 ner challenge: fasttext crf. present system cap 2017 ner challenge named entity recognition french tweets. system leverages unsupervised learning larger dataset french tweets learn features feeding crf model. ranked first without using gazetteer structured external data, f-measure 58.89\%. 
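The double-integration step of ridi described above is mechanically simple; the forward-Euler sketch below shows only that final integration from corrected accelerations to positions (the learned velocity regression and bias correction are omitted):

```python
def double_integrate(accel, dt):
    """Integrate 1-d acceleration twice (forward Euler) to get positions.
    In RIDI-style pipelines the input would already be bias-corrected."""
    v, p = 0.0, 0.0
    positions = []
    for a in accel:
        v += a * dt          # first integration: velocity
        p += v * dt          # second integration: position
        positions.append(p)
    return positions

# constant 1 m/s^2 for 1 s at 10 Hz; forward Euler gives 0.55 m,
# near the analytic 0.5 * a * t^2 = 0.5 m
track = double_integrate([1.0] * 10, dt=0.1)
```

The discretization error here also hints at why raw double integration drifts badly, motivating the learned velocity correction in the paper.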
best knowledge, first system use fasttext embeddings (which include subword representations) embedding-based sentence representation ner.",4 "data mining actionable knowledge: survey. data mining process consists series steps ranging data cleaning, data selection transformation, pattern evaluation visualization. one central problems data mining make mined patterns knowledge actionable. here, term actionable refers mined patterns suggest concrete profitable actions decision-maker. is, user something bring direct benefits (increase profits, reduction cost, improvement efficiency, etc.) organization's advantage. however, written comprehensive survey available topic. goal paper fill void. paper, first present two frameworks mining actionable knowledge inexplicitly adopted existing research methods. try situate research topic two different viewpoints: 1) data mining tasks 2) adopted framework. finally, specify issues either addressed insufficiently studied yet conclude paper.",4 "opinion mining relating subjective expressions annual earnings us financial statements. financial statements contain quantitative information manager's subjective evaluation firm's financial status. using information released u.s. 10-k filings. qualitative quantitative appraisals crucial quality financial decisions. extract opinioned statements reports, built tagging models based conditional random field (crf) techniques, considering variety combinations linguistic factors including morphology, orthography, predicate-argument structure, syntax, simple semantics. results show crf models reasonably effective find opinion holders experiments adopted popular mpqa corpus training testing. contribution paper identify opinion patterns multiword expressions (mwes) forms rather single word forms. find managers corporations attempt use optimistic words obfuscate negative financial performance accentuate positive financial performance. 
results also show decreasing earnings often accompanied ambiguous mild statements reporting year increasing earnings stated assertive positive way.",4 "back basics: bayesian extensions irt outperform neural networks proficiency estimation. estimating student proficiency important task computer based learning systems. compare family irt-based proficiency estimation methods deep knowledge tracing (dkt), recently proposed recurrent neural network model promising initial results. evaluate well model predicts student's future response given previous responses using two publicly available one proprietary data set. find irt-based methods consistently matched outperformed dkt across data sets finest level content granularity tractable trained on. hierarchical extension irt captured item grouping structure performed best overall. data sets included non-trivial autocorrelations student response patterns, temporal extension irt improved performance standard irt rnn-based method not. conclude irt-based models provide simpler, better-performing alternative existing rnn-based models student interaction data also affording interpretability guarantees due formulation bayesian probabilistic models.",4 "improving universality learnability neural programmer-interpreters combinator abstraction. overcome limitations neural programmer-interpreters (npi) universality learnability, propose incorporation combinator abstraction neural programing new npi architecture support abstraction, call combinatory neural programmer-interpreter (cnpi). combinator abstraction dramatically reduces number complexity programs need interpreted core controller cnpi, still allowing cnpi represent interpret arbitrary complex programs collaboration core components. propose small set four combinators capture pervasive programming patterns. 
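The irt-based proficiency estimation discussed above can be illustrated with the simplest member of the family, the rasch (1pl) model, fit by maximum-likelihood gradient ascent; this generic sketch is not the paper's hierarchical bayesian extension:

```python
import math

def rasch_p(theta, b):
    """Rasch (1PL IRT) probability of a correct response:
    sigmoid of proficiency minus item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, difficulties, lr=0.1, steps=500):
    """MLE of proficiency by gradient ascent on the log-likelihood;
    the gradient is the sum of (response - predicted probability)."""
    theta = 0.0
    for _ in range(steps):
        grad = sum(r - rasch_p(theta, b)
                   for r, b in zip(responses, difficulties))
        theta += lr * grad
    return theta

# three correct, one miss on items of increasing difficulty
theta_hat = estimate_theta([1, 1, 1, 0], [-1.0, 0.0, 1.0, 2.0])
```

The bayesian extensions compared in the paper replace this point estimate with a posterior over theta, which is what yields the reported interpretability guarantees.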
Due to the finiteness and simplicity of this combinator set, and the offloading of the burden of interpretation from the core, we are able to construct a CNPI that is universal with respect to the set of all combinatorizable programs, which is adequate for solving most algorithmic tasks. Moreover, besides supervised training on execution traces, the CNPI can be trained by policy gradient reinforcement learning with appropriately designed curricula.",4 "Soft rule ensembles for statistical learning. In this article supervised learning problems are solved using soft rule ensembles. We first review the importance sampling learning ensembles (ISLE) approach that is useful for generating hard rules. The soft rules are then obtained with logistic regression from the corresponding hard rules. In order to deal with the perfect separation problem related to logistic regression, Firth's bias corrected likelihood is used. Various examples and simulation results show that soft rule ensembles can improve the predictive performance of hard rule ensembles.",19 "Identifying dogmatism in social media: signals and models. We explore linguistic and behavioral features of dogmatism in social media and construct statistical models to identify dogmatic comments. Our model is based on a corpus of Reddit posts, collected across a diverse set of conversational topics and annotated via paid crowdsourcing. We operationalize key aspects of dogmatism described by existing psychology theories (such as over-confidence), finding that they have predictive power. We also find evidence for new signals of dogmatism, such as the tendency of dogmatic posts to refrain from signaling cognitive processes. When we use our predictive model to analyze millions of other Reddit posts, we find evidence suggesting that dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage with dogmatic comments tend to show increases in dogmatic posts themselves.",4 "Jointly modeling embedding and translation to bridge video and language. Automatically describing video content with natural language is a fundamental challenge of multimedia. Recurrent neural networks (RNN), which model sequence dynamics, have attracted increasing attention on visual interpretation. 
However, most existing approaches generate a word locally given the previous words and the visual content, so that the relationship between sentence semantics and visual content is not holistically exploited. As a result, the generated sentences may be contextually correct but the semantics (e.g., subjects, verbs or objects) are not true. This paper presents a novel unified framework, named long short-term memory with visual-semantic embedding (LSTM-E), which can simultaneously explore the learning of LSTM and visual-semantic embedding. The former aims to locally maximize the probability of generating the next word given the previous words and visual content, while the latter creates a visual-semantic embedding space for enforcing the relationship between the semantics of the entire sentence and the visual content. The proposed LSTM-E consists of three components: a 2-D and/or 3-D deep convolutional neural network for learning a powerful video representation, a deep RNN for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics. Experiments on the YouTube2Text dataset show that the proposed LSTM-E achieves the best reported performance to date in generating natural sentences: 45.3% and 31.0% in terms of BLEU@4 and METEOR, respectively. We also demonstrate that LSTM-E is superior to several state-of-the-art techniques in predicting subject-verb-object (SVO) triplets.",4 "Risk agoras: dialectical argumentation for scientific reasoning. We propose a formal framework for intelligent systems which can reason about scientific domains, in particular about the carcinogenicity of chemicals, and we study its properties. Our framework is grounded in the philosophy of scientific enquiry and discourse, and uses a model of dialectical argumentation. The formalism enables the representation of scientific uncertainty and conflict in a manner suitable for qualitative reasoning about the domain.",4 "Multigrid neural architectures. We propose a multigrid extension of convolutional neural networks (CNNs). Rather than manipulating representations living on a single spatial grid, our network layers operate across scale space, on a pyramid of grids. They consume multigrid inputs and produce multigrid outputs; convolutional filters themselves have both within-scale and cross-scale extent. This aspect is distinct from simple multiscale designs, which only process the input at different scales. 
Viewed in terms of information flow, a multigrid network passes messages across a spatial pyramid. As a consequence, receptive field size grows exponentially with depth, facilitating rapid integration of context. Most critically, the multigrid structure enables networks to learn internal attention and dynamic routing mechanisms, and use them to accomplish tasks on which modern CNNs fail. Experiments demonstrate wide-ranging performance advantages of multigrid. On CIFAR and ImageNet classification tasks, flipping from a single grid to multigrid within the standard CNN paradigm improves accuracy, while being compute and parameter efficient. Multigrid is independent of other architectural choices; we show synergy in combination with residual connections. Multigrid yields dramatic improvement on a synthetic semantic segmentation dataset. Most strikingly, relatively shallow multigrid networks can learn to directly perform spatial transformation tasks, where, in contrast, current CNNs fail. Together, our results suggest that continuous evolution of features on a multigrid pyramid is a more powerful alternative to existing CNN designs on a flat grid.",4 "Toward an integrated framework for automated development and optimization of online advertising campaigns. Creating and monitoring competitive and cost-effective pay-per-click advertisement campaigns through the web-search channel is a resource demanding task in terms of expertise and effort. Assisting or even automating the work of an advertising specialist has unrivaled commercial value. In this paper we propose a methodology, an architecture, and a fully functional framework for semi- and fully-automated creation, monitoring, and optimization of cost-efficient pay-per-click campaigns with budget constraints. The campaign creation module automatically generates keywords based on the content of the web page to be advertised, extended with corresponding ad-texts. These keywords are used to automatically create campaigns fully equipped with appropriate values set. The campaigns are uploaded to the auctioneer platform and start running. The optimization module focuses on a learning process over existing campaign statistics and also over the strategies applied in previous periods, in order to invest optimally in the next period. The objective is to maximize performance (i.e. clicks, actions) under the current budget constraint. 
The fully functional prototype is experimentally evaluated on real world Google AdWords campaigns and exhibits promising behavior with regard to campaign performance statistics, as it systematically outperforms competing manually maintained campaigns.",4 "Enabling factor analysis on thousand-subject neuroimaging datasets. The scale of functional magnetic resonance image data is rapidly increasing as large multi-subject datasets become widely available and high-resolution scanners are adopted. The inherent low-dimensionality of the information in this data has led neuroscientists to consider factor analysis methods to extract and analyze the underlying brain activity. In this work, we consider two recent multi-subject factor analysis methods: the shared response model and hierarchical topographic factor analysis. We perform analytical, algorithmic, and code optimization to enable multi-node parallel implementations to scale. Single-node improvements result in 99x and 1812x speedups on the two methods, enabling the processing of larger datasets. Our distributed implementations show strong scaling of 3.3x and 5.5x respectively with 20 nodes on real datasets. We also demonstrate weak scaling on a synthetic dataset with 1024 subjects, on up to 1024 nodes and 32,768 cores.",19 "Meta-Prod2Vec - product embeddings using side-information for recommendation. We propose Meta-Prod2Vec, a novel method to compute item similarities for recommendation that leverages existing item metadata. Such scenarios are frequently encountered in applications such as content recommendation, ad targeting and web search. Our method leverages past user interactions with items and their attributes to compute low-dimensional embeddings of items. Specifically, the item metadata is injected into the model as side information to regularize the item embeddings. We show that the new item representations lead to better performance on recommendation tasks on an open music dataset.",4 "Automatic curation of golf highlights using multimodal excitement features. The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media. Yet, it requires labor-intensive video editing. 
We propose a novel approach for auto-curating sports highlights, and use it to create a real-world system for the editorial aid of golf highlight reels. Our method fuses information from the players' reactions (action recognition such as high-fives and fist pumps), the spectators (crowd cheering), and the commentator (tone of voice and word analysis) to determine the most interesting moments of a game. We accurately identify the start and end frames of key shot highlights with additional metadata, such as the player's name and the hole number, allowing personalized content summarization and retrieval. In addition, we introduce new techniques for learning our classifiers with reduced manual training data annotation by exploiting the correlation of different modalities. Our work has been demonstrated at a major golf tournament, successfully extracting highlights from live video streams over four consecutive days.",4 "Latent semantics of action verbs reflect phonetic parameters of intensity and emotional content. Conjuring up our thoughts, language reflects statistical patterns of word co-occurrences which in turn come to describe how we perceive the world. Whether counting how frequently nouns and verbs combine in Google search queries, or extracting eigenvectors from term document matrices made up of Wikipedia lines and Shakespeare plots, the resulting latent semantics capture not only the associative links which form concepts, but also spatial dimensions embedded within the surface structure of language. As both the shape and movements of objects have been found to be associated with phonetic contrasts already in toddlers, this study explores whether articulatory and acoustic parameters may likewise differentiate the latent semantics of action verbs. Selecting 3 x 20 emotion, face, and hand related verbs known to activate premotor areas in the brain, their mutual cosine similarities were computed using latent semantic analysis LSA, and the resulting adjacency matrices were compared based on two different large scale text corpora; HAWIK and TASA. Applying hierarchical clustering to identify common structures across the two text corpora, the verbs largely divide into combined mouth and hand movements versus emotional expressions. 
Transforming the verbs into their constituent phonemes, the clustered small and large size movements appear differentiated by front versus back vowels, corresponding to increasing levels of arousal. Whereas the clustered emotional verbs seem characterized by sequences of close versus open jaw produced phonemes, generating up- or downwards shifts in formant frequencies which may influence perceived valence. This suggests that the latent semantics of action verbs reflect parameters of intensity and emotional polarity that appear correlated with the articulatory contrasts and acoustic characteristics of phonemes.",4 "Path planning with kinematic constraints for robot groups. Path planning for multiple robots is well studied in the AI and robotics communities. For a given discretized environment, robots need to find collision-free paths to a set of specified goal locations. Robots can be fully anonymous, non-anonymous, or organized in groups. Although powerful solvers for this abstract problem exist, they make simplifying assumptions by ignoring kinematic constraints, making it difficult to use the resulting plans on actual robots. In this paper, we present a solution which takes kinematic constraints, such as maximum velocities, into account, while guaranteeing a user-specified minimum safety distance between robots. We demonstrate our approach in simulation and on real robots in 2D and 3D environments.",4 "Nonparametric inference for auto-encoding variational Bayes. We would like to learn latent representations that are low-dimensional and highly interpretable. A model that has these characteristics is the Gaussian process latent variable model. The benefits and drawbacks of the GP-LVM are complementary to the variational autoencoder: the former provides interpretable low-dimensional latent representations, while the latter is able to handle large amounts of data and can use non-Gaussian likelihoods. Our inspiration for this paper is to marry these two approaches and reap the benefits of both. In order to do so we introduce a novel approximate inference scheme inspired by both the GP-LVM and the VAE. 
We show experimentally that the approximation allows the capacity of the generative bottle-neck (Z) of the VAE to be arbitrarily large without losing a highly interpretable representation, allowing reconstruction quality to be unlimited by Z while at the same time a low-dimensional space can be used both to perform ancestral sampling from and as a means to reason about the embedded data.",19 "Biometric authorization system using gait biometry. Human gait, a new biometric aimed at recognizing individuals by the way they walk, has come to play an increasingly important role in visual surveillance applications. In this paper a novel hybrid holistic approach is proposed to show how behavioural walking characteristics can be used to recognize unauthorized and suspicious persons when they enter a surveillance area. Initially the background is modelled from the input video captured from cameras deployed for security, and the foreground moving object in the individual frames is segmented using a background subtraction algorithm. Gait features representing spatial, temporal and wavelet components are extracted and fused for training and testing multi class support vector machine (SVM) models. The proposed system is evaluated using side view videos of the NLPR database. The experimental results demonstrate that the proposed system achieves a pleasing recognition rate, and also indicate that the classification ability of SVM with a radial basis function (RBF) kernel is better than with other kernel functions.",4 "Active ranking from pairwise comparisons and when parametric assumptions don't help. We consider sequential or active ranking of a set of n items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of pre-specified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-k items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comparisons won, and uses these counts to decide whether to stop, or to compare another pair of items, chosen based on confidence intervals specified by the data collected up to that point. We prove that this algorithm succeeds in recovering the ranking using a number of comparisons that is optimal up to logarithmic factors. 
This guarantee does not require any structural properties of the underlying pairwise probability matrix, unlike a significant body of past work on pairwise ranking based on parametric models such as the Thurstone or Bradley-Terry-Luce models. It has been a long-standing open question whether imposing these parametric assumptions allows for improved ranking algorithms. For stochastic comparison models, in which the pairwise probabilities are bounded away from zero, our second contribution is to resolve this issue by proving a lower bound for parametric models. This shows, perhaps surprisingly, that these popular parametric modeling choices offer at most logarithmic gains over stochastic comparisons.",4 "Knowledge management in economic intelligence with reasoning on temporal attributes. People often make important decisions within a time frame. Hence, it is imperative to employ means or strategies to aid effective decision making. Consequently, economic intelligence (EI) has emerged as a field to aid strategic and timely decision making in an organization. In the course of attaining this goal, it is indispensable to make provision for the conservation of the intellectual resource invested in the process of decision making. This intellectual resource is nothing else than the knowledge of the actors as well as of the various processes effecting the decision making. Knowledge has been recognized as a strategic economic resource for enhancing productivity and a key to innovation in any organization or community. Thus, its adequate management, with cognizance of its temporal properties, is highly indispensable. The temporal properties of knowledge refer to the date and time (known as the timestamp) when the knowledge was created, as well as the duration or interval between related pieces of knowledge. This paper focuses on the need for a user-centered knowledge management approach as well as the exploitation of the associated temporal properties. Our perspective on knowledge is with respect to decision-problem projects in EI. The hypothesis is that the possibility of reasoning on temporal properties for the exploitation of knowledge in EI projects can foster timely decision making and the generation of useful inferences from available and reusable knowledge for a new project.",4 "Ground truth bias in external cluster validity indices. It has been noticed that some external CVIs exhibit a preferential bias towards a larger or smaller number of clusters which is monotonic (directly or inversely) in the number of clusters in candidate partitions. 
This type of bias is caused by the functional form of the CVI model. For example, the popular Rand index (RI) exhibits a monotone increasing (NCinc) bias, while the Jaccard index (JI) suffers from a monotone decreasing (NCdec) bias. This type of bias has been previously recognized in the literature. In this work, we identify a new type of bias arising from the distribution of the ground truth (reference) partition against which candidate partitions are compared. We call this new type of bias ground truth (GT) bias. This type of bias occurs if a change in the reference partition causes a change in the bias status (e.g., NCinc, NCdec) of a CVI. For example, the NCinc bias of the RI can be changed to an NCdec bias by skewing the distribution of clusters in the ground truth partition. It is important for users to be aware of this new type of biased behaviour, since it may affect the interpretation of CVI results. The objective of this article is to study the empirical and theoretical implications of GT bias. To the best of our knowledge, this is the first extensive study of such a property of external cluster validity indices.",19 "Analyzing language development from a network approach. In this paper we propose some new measures of language development using network analyses, inspired by the recent surge of interest in network studies of many real-world systems. Children's and care-takers' speech data from a longitudinal study are represented as a series of networks, with word forms taken as nodes and collocation of words as links. Measures of the properties of the networks, such as size, connectivity, hub and authority analyses, etc., allow us to make quantitative comparisons and reveal different paths of development. For example, the asynchrony of development in network size and average degree suggests that children cannot simply be classified as early talkers or late talkers by one or two measures. Children follow different paths in a multi-dimensional space. They may develop faster in one dimension but slower in another. The network approach requires little preprocessing of words or analysis of sentence structures, and the characteristics of word usage emerge from the network independent of grammatical presumptions. We show that the change of the two articles ""the"" and ""a"" in their roles as important nodes in the network reflects the progress of children's syntactic development: the two articles often start in children's networks as hubs and later shift to authorities, while they are constantly authorities in adults' networks. 
Network analyses provide a new approach to the study of language development, while at the same time language development presents a rich area for network theories to explore.",4 "Training spiking neural networks based on information theoretic costs. A spiking neural network is a type of artificial neural network in which neurons communicate with spikes. Spikes are identical Boolean events characterized by their time of arrival. A spiking neuron has internal dynamics and responds to the history of inputs as opposed to the current inputs only. Because of these properties a spiking neural network has rich intrinsic capabilities to process spatiotemporal data. However, because spikes are discontinuous 'yes or no' events, it is not trivial to apply traditional training procedures such as gradient descent to spiking neurons. In this thesis we propose to use stochastic spiking neuron models in which the probability of a spiking output is a continuous function of the parameters. We formulate several learning tasks as the minimization of certain information-theoretic cost functions that use the spiking output probability distributions. We develop a generalized description of the stochastic spiking neuron and a new spiking neuron model that allows rich spatiotemporal data to be processed flexibly. We formulate and derive learning rules for the following tasks: a supervised learning task of detecting a spatiotemporal pattern, as minimization of the negative log-likelihood (the surprisal) of the neuron's output; an unsupervised learning task of increasing the stability of the neuron's output, as minimization of its entropy; and a reinforcement learning task of controlling an agent, as modulated optimization of the filtered surprisal of the neuron's output. We test the derived learning rules in several experiments, such as spatiotemporal pattern detection, spatiotemporal data storing and recall with an autoassociative memory, combination of supervised and unsupervised learning to speed up the learning process, and adaptive control of simple virtual agents in changing environments.",4 "Indian sign language recognition using an eigen value weighted Euclidean distance based classification technique. Sign language recognition is one of the growing fields of research today. Many new techniques have been developed recently in this field. 
In this paper, we propose a system using eigen value weighted Euclidean distance as a classification technique for the recognition of various sign languages of India. The system comprises four parts: skin filtering, hand cropping, feature extraction and classification. Twenty four signs were considered in this paper, each with ten samples; thus a total of two hundred forty images were considered, for which a recognition rate of 97 percent was obtained.",4 "Biologically inspired protection of deep networks from adversarial attacks. Inspired by biophysical principles underlying nonlinear dendritic computation in neural circuits, we develop a scheme to train deep neural networks to make them robust to adversarial attacks. Our scheme generates highly nonlinear, saturated neural networks that achieve state of the art performance on gradient based adversarial examples on MNIST, despite never being exposed to adversarially chosen examples during training. Moreover, these networks exhibit unprecedented robustness to targeted, iterative schemes for generating adversarial examples, including second-order methods. We identify principles governing how these networks achieve their robustness, drawing on methods from information geometry. We find these networks progressively create highly flat and compressed internal representations that are sensitive to very few input dimensions, while still solving the task. Moreover, they employ highly kurtotic weight distributions, also found in the brain, and we demonstrate how such kurtosis can protect even linear classifiers from adversarial attack.",19 "Unsupervised context-sensitive spelling correction of English and Dutch clinical free-text with word and character n-gram embeddings. We present an unsupervised context-sensitive spelling correction method for clinical free-text that uses word and character n-gram embeddings. Our method generates misspelling replacement candidates and ranks them according to their semantic fit, by calculating a weighted cosine similarity between the vectorized representation of a candidate and the misspelling context. To tune the parameters of this model, we generate self-induced spelling error corpora. We perform our experiments for two languages. 
For English, we greatly outperform off-the-shelf spelling correction tools on a manually annotated MIMIC-III test set, and counter the frequency bias of a noisy channel model, showing that neural embeddings can be successfully exploited to improve upon the state-of-the-art. For Dutch, we also outperform an off-the-shelf spelling correction tool on manually annotated clinical records from the Antwerp University Hospital, and offer empirical evidence that our method counters the frequency bias of a noisy channel model in this case as well. However, both the context-sensitive model and our implementation of the noisy channel model obtain high scores on this test set, establishing a state-of-the-art for Dutch clinical spelling correction with the noisy channel model.",4 "Automatic calcium scoring in low-dose chest CT using deep neural networks with dilated convolutions. Heavy smokers undergoing screening with low-dose chest CT are affected by cardiovascular disease as much as by lung cancer. Low-dose chest CT scans acquired in screening enable quantification of atherosclerotic calcifications and thus enable identification of subjects at increased cardiovascular risk. This paper presents a method for automatic detection of coronary artery, thoracic aorta and cardiac valve calcifications in low-dose chest CT using two consecutive convolutional neural networks. The first network identifies and labels potential calcifications according to their anatomical location and the second network identifies true calcifications among the detected candidates. The method was trained and evaluated on a set of 1744 CT scans from the National Lung Screening Trial. To determine whether only images reconstructed with soft tissue filters can be used for calcification detection, we evaluated the method on soft and medium/sharp filter reconstructions separately. With soft filter reconstructions, the method achieved F1 scores of 0.89, 0.89, 0.67, and 0.55 for coronary artery, thoracic aorta, aortic valve and mitral valve calcifications, respectively. With sharp filter reconstructions, the F1 scores were 0.84, 0.81, 0.64, and 0.66, respectively. Linearly weighted kappa coefficients for risk category assignment based on per subject coronary artery calcium were 0.91 and 0.90 for soft and sharp filter reconstructions, respectively. 
These results demonstrate that the presented method enables reliable automatic cardiovascular risk assessment in low-dose chest CT scans acquired for lung cancer screening.",4 "That thing we tried didn't work very well: deictic representation in reinforcement learning. Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Using a deictic representation is believed to be a viable alternative: it promises generalization while allowing the use of existing reinforcement-learning methods. Yet, there are few experiments on learning with deictic representations reported in the literature. In this paper we explore the effectiveness of two forms of deictic representation and a na\""{i}ve propositional representation in a simple blocks-world domain. We find, empirically, that the deictic representations actually worsen learning performance. We conclude with a discussion of possible causes of these results and strategies for more effective learning in domains with objects.",4 "Parsimonious topic models with salient word discovery. We propose a parsimonious topic model for text corpora. In related models such as latent Dirichlet allocation (LDA), all words are modeled topic-specifically, even though many words occur with similar frequencies across different topics. Our modeling determines salient words for each topic, which have topic-specific probabilities, with the rest explained by a universal shared model. Further, in LDA all topics are in principle present in every document. By contrast, our model gives a sparse topic representation, determining the (small) subset of relevant topics for each document. We derive a Bayesian information criterion (BIC), balancing model complexity and goodness of fit. Here, interestingly, we identify an effective sample size and corresponding penalty specific to each parameter type in our model. We minimize BIC to jointly determine our entire model -- the topic-specific words, document-specific topics, all model parameter values, {\it and} the total number of topics -- in a wholly unsupervised fashion. 
Results on three text corpora and an image dataset show that our model achieves higher test set likelihood and better agreement with ground-truth class labels, compared to LDA and to a model designed to incorporate sparsity.",4 "Exploiting causal independence in Bayesian network inference. A new method is proposed for exploiting causal independencies in exact Bayesian network inference. A Bayesian network can be viewed as representing a factorization of a joint probability into the multiplication of a set of conditional probabilities. We present a notion of causal independence that enables one to further factorize the conditional probabilities into a combination of even smaller factors and consequently obtain a finer-grain factorization of the joint probability. The new formulation of causal independence lets us specify the conditional probability of a variable given its parents in terms of an associative and commutative operator, such as ``or'', ``sum'' or ``max'', on the contribution of each parent. We start with a simple algorithm for Bayesian network inference that, given evidence and a query variable, uses the factorization to find the posterior distribution of the query. We show how this algorithm can be extended to exploit causal independence. Empirical studies, based on the CPCS networks for medical diagnosis, show that this method is more efficient than previous methods and allows for inference in larger networks than previous algorithms.",4 "Bat algorithm is better than intermittent search strategy. The efficiency of any metaheuristic algorithm largely depends on the way it balances local intensive exploitation and global diverse exploration. Studies show that the bat algorithm can provide a good balance between these two key components with superior efficiency. In this paper, we first review some commonly used metaheuristic algorithms, and then compare the performance of the bat algorithm with the so-called intermittent search strategy. From simulations, we find that the bat algorithm is better than the optimal intermittent search strategy. We also analyse the comparison results and their implications for higher dimensional optimization problems. In addition, we also apply the bat algorithm to solving business optimization and engineering design problems.",12 "A multi-engine approach to answer set programming. 
Answer set programming (ASP) is a truly-declarative programming paradigm proposed in the area of non-monotonic reasoning and logic programming that has recently been employed in many applications. The development of efficient ASP systems is, thus, crucial. With the task of improving the solving methods for ASP in mind, there are two usual ways to reach this goal: $(i)$ extending state-of-the-art techniques and ASP solvers, or $(ii)$ designing a new ASP solver from scratch. An alternative to these trends is to build on top of state-of-the-art solvers, and to apply machine learning techniques for choosing automatically the ""best"" available solver on a per-instance basis. In this paper we pursue this latter direction. We first define a set of cheap-to-compute syntactic features that characterize several aspects of ASP programs. Then, we apply classification methods that, given the features of the instances in a {\sl training} set and the solvers' performance on these instances, inductively learn algorithm selection strategies to be applied to a {\sl test} set. We report the results of a number of experiments considering solvers and different training and test sets of instances taken from the ones submitted to the ""System Track"" of the 3rd ASP competition. Our analysis shows that, by applying machine learning techniques to ASP solving, it is possible to obtain very robust performance: our approach can solve more instances compared with any solver that entered the 3rd ASP competition. (To appear in Theory and Practice of Logic Programming (TPLP).)",4 "Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-document summarization.",4 "A fast deep learning model for textual relevance in biomedical information retrieval. Publications in the life sciences are characterized by a large technical vocabulary, with many lexical and semantic variations for expressing the same concept. 
Towards addressing the problem of relevance in biomedical literature search, we introduce a deep learning model for the relevance of a document's text to a keyword style query. Limited by a relatively small amount of training data, the model uses pre-trained word embeddings. With these, the model first computes a variable-length delta matrix between the query and document, representing the difference between the two texts, which is then passed through a deep convolution stage followed by a deep feed-forward network to compute a relevance score. The result is a fast model suitable for use in an online search engine. The model is robust and outperforms comparable state-of-the-art deep learning approaches.",4 "Deep residual bidir-LSTM for human activity recognition using wearable sensors. Human activity recognition (HAR) has become a popular topic in research because of its wide application. With the development of deep learning, new ideas have appeared to address HAR problems. Here, a deep network architecture using residual bidirectional long short-term memory (LSTM) cells is proposed. The advantages of the new network include, first, that a bidirectional connection can concatenate the positive time direction (forward state) and the negative time direction (backward state). Second, residual connections between stacked cells act as highways for gradients, which can pass underlying information directly to the upper layer, effectively avoiding the gradient vanishing problem. Generally, the proposed network shows improvements in both the temporal (using bidirectional cells) and the spatial (residual connections stacked deeply) dimensions, aiming to enhance the recognition rate. When tested on the Opportunity data set and the public domain UCI data set, the accuracy was increased by 4.78% and 3.68%, respectively, compared with previously reported results. Finally, the confusion matrix for the public domain UCI data set is analyzed.",4 "Multi-label image recognition by recurrently discovering attentional regions. This paper proposes a novel deep architecture to address multi-label image recognition, a fundamental and practical task towards general visual understanding. Current solutions for this task usually rely on an extra step of extracting hypothesis regions (i.e., region proposals), resulting in redundant computation and sub-optimal performance. 
In this work, we achieve interpretable and contextualized multi-label image classification by developing a recurrent memorized-attention module. This module consists of two alternately performed components: i) a spatial transformer layer to locate attentional regions from the convolutional feature maps in a region-proposal-free way and ii) an LSTM (long-short term memory) sub-network to sequentially predict semantic labeling scores on the located regions while capturing the global dependencies of these regions. The LSTM also outputs the parameters for computing the spatial transformer. On large-scale benchmarks of multi-label image classification (e.g., MS-COCO and PASCAL VOC 07), our approach demonstrates superior performance over existing state-of-the-arts in both accuracy and efficiency.",4 "Adversarial dropout for supervised and semi-supervised learning. Recently, training with adversarial examples, which are generated by adding a small but worst-case perturbation to input examples, has been proved to improve the generalization performance of neural networks. In contrast to the individually biased inputs used to enhance generality, this paper introduces adversarial dropout, a minimal set of dropouts that maximizes the divergence between the outputs of the network with the dropouts and the training supervisions. The identified adversarial dropout is used to reconfigure the neural network being trained, and we demonstrate that training on the reconfigured sub-network improves generalization performance on supervised and semi-supervised learning tasks on MNIST and CIFAR-10. We analyzed the trained model to reason about the performance improvement, and found that adversarial dropout increases the sparsity of neural networks more than standard dropout does.",4 "Linguistics and some aspects of its underlying dynamics. In recent years, central components of a new approach to linguistics, the minimalist program (MP), have come closer to physics. Features of the minimalist program, such as the unconstrained nature of recursive merge, the operation of the labeling algorithm at the interface of narrow syntax with the conceptual-intentional and sensory-motor interfaces, the difference between pronounced and un-pronounced copies of elements in a sentence, and the build-up of the Fibonacci sequence in the syntactic derivation of sentence structures, are directly accessible to representation in terms of an algebraic formalism. 
although scheme linguistic structures classical ones, find interesting productive isomorphism established mp structure, algebraic structures many-body field theory opening new avenues inquiry dynamics underlying central aspects linguistics.",4 "deepapt: nation-state apt attribution using end-to-end deep neural networks. recent years numerous advanced malware, aka advanced persistent threats (apt) allegedly developed nation-states. task attributing apt specific nation-state extremely challenging several reasons. nation-state usually single cyber unit develops advanced malware, rendering traditional authorship attribution algorithms useless. furthermore, apts use state-of-the-art evasion techniques, making feature extraction challenging. finally, dataset available apts extremely small. paper describe deep neural networks (dnn) could successfully employed nation-state apt attribution. use sandbox reports (recording behavior apt run dynamically) raw input neural network, allowing dnn learn high level feature abstractions apts itself. using test set 1,000 chinese russian developed apts, achieved accuracy rate 94.6%.",4 "deep unsupervised intrinsic image decomposition siamese training. harness modern intrinsic decomposition tools based deep learning increase applicability realworld use cases. traditional techniques derived retinex theory: handmade prior assumptions constrain optimization yield unique solution qualitatively satisfying limited set examples. modern techniques based supervised deep learning leverage largescale databases usually synthetic sparsely annotated. decomposition quality images wild therefore arguable. propose end-to-end deep learning solution trained without ground truth supervision, hard obtain. time-lapses form ubiquitous source data (under scene staticity assumption) capture constant albedo varying shading conditions. exploit natural relationship train unsupervised siamese manner image pairs. 
yet, trained network applies single images inference time. present new dataset demonstrate siamese training on, reach results compete state art, despite unsupervised nature training scheme. evaluation difficult, rely extensive experiments analyze strengths weaknesses related methods.",4 "cascaded region-based densely connected network event detection: seismic application. automatic event detection time series signals wide applications, abnormal event detection video surveillance event detection geophysical data. traditional detection methods detect events primarily use similarity correlation data. methods inefficient yield low accuracy. recent years, significantly increased computational power, machine learning techniques revolutionized many science engineering domains. study, apply deep-learning-based method detection events time series seismic signals. however, direct adaptation similar ideas 2d object detection problem faces two challenges. first challenge duration earthquake event varies significantly; second challenge proposals generated temporally correlated. address challenges, propose novel cascaded region-based convolutional neural network capture earthquake events different sizes, incorporating contextual information enrich features individual proposal. achieve better generalization performance, use densely connected blocks backbone network. fact positive events not correctly annotated, formulate detection problem learning-from-noise problem. verify performance detection methods, employ methods seismic data generated bi-axial ""earthquake machine"" located rock mechanics laboratory, acquire labels help experts. numerical tests, show novel detection techniques yield high accuracy. therefore, novel deep-learning-based detection methods potentially powerful tools locating events time series data various applications.",4 "possible similarity gene semantic networks. 
several domains linguistics, molecular biology social sciences, holistic effects hardly well-defined modeling single units, studies tend understand macro structures help meaningful useful associations fields social networks, systems biology semantic web. stochastic multi-agent system offers accurate theoretical framework operational computing implementations model large-scale associations, dynamics patterns extraction. show clustering around target object set associations object prove similarity specific data two case studies gene-gene term-term relationships leading idea common organizing principle cognition random deterministic effects.",4 "generic multiplicative methods implementing machine learning algorithms mapreduce. paper introduce generic model multiplicative algorithms suitable mapreduce parallel programming paradigm. implement three typical machine learning algorithms demonstrate similarity comparison, gradient descent, power method classic learning techniques fit model well. two versions large-scale matrix multiplication discussed paper, different methods developed cases regard unique computational characteristics problem settings. contrast earlier research, focus fundamental linear algebra techniques establish generic approach range algorithms, rather specific ways scaling algorithms one time. experiments show promising results evaluated speedup accuracy. compared standard implementation computational complexity $o(m^3)$ worst case, large-scale matrix multiplication experiments prove design considerably efficient maintains good speedup number cores increases. algorithm-specific experiments also produce encouraging results runtime performance.",4 "spatially constrained location prior scene parsing. semantic context important useful cue scene parsing complicated natural images substantial amount variations objects environment. 
paper proposes spatially constrained location prior (sclp) effective modelling global local semantic context scene terms inter-class spatial relationships. unlike existing studies focusing either relative absolute location prior objects, sclp effectively incorporates relative absolute location priors calculating object co-occurrence frequencies spatially constrained image blocks. sclp general used conjunction various visual feature-based prediction models, artificial neural networks support vector machine (svm), enforce spatial contextual constraints class labels. using svm classifiers linear regression model, demonstrate incorporation sclp achieves superior performance compared state-of-the-art methods stanford background sift flow datasets.",4 "power joint wavelet-dct features multispectral palmprint recognition. biometric-based identification drawn lot attention recent years. among biometrics, palmprint known possess rich set features. paper proposed use dct-based features parallel wavelet-based ones palmprint identification. pca applied features reduce dimensionality majority voting algorithm used perform classification. features introduced result near-perfectly accurate identification. method tested well-known multispectral palmprint database accuracy rate 99.97-100% achieved, outperforming previous methods similar conditions.",4 "multiphase image segmentation based fuzzy membership functions l1-norm fidelity. paper, propose variational multiphase image segmentation model based fuzzy membership functions l1-norm fidelity. apply alternating direction method multipliers solve equivalent problem. subproblems solved efficiently. specifically, propose fast method calculate fuzzy median. experimental results comparisons show l1-norm based method robust outliers impulse noise keeps better contrast l2-norm counterpart. theoretically, prove existence minimizer analyze convergence algorithm.",12 "chinese dataset negative full forms general abbreviation prediction. 
abbreviation common phenomenon across languages, especially chinese. cases, expression abbreviated, abbreviation used often fully expanded forms, since people tend convey information concise way. various language processing tasks, abbreviation obstacle improving performance, textual form abbreviation express useful information, unless expanded full form. abbreviation prediction means associating fully expanded forms abbreviations. however, due deficiency abbreviation corpora, task limited current studies, especially considering general abbreviation prediction also include full form expressions valid abbreviations, namely negative full forms (nffs). corpora incorporating negative full forms general abbreviation prediction number. order promote research area, build dataset general chinese abbreviation prediction, needs preprocessing steps, evaluate several different models built dataset. dataset available https://github.com/lancopku/chinese-abbreviation-dataset",4 "self-supervised learning motion capture. current state-of-the-art solutions motion capture single camera optimization driven: optimize parameters 3d human model re-projection matches measurements video (e.g. person segmentation, optical flow, keypoint detections etc.). optimization models susceptible local minima. bottleneck forced using clean green-screen like backgrounds capture time, manual initialization, switching multiple cameras input resource. work, propose learning based motion capture model single camera input. instead optimizing mesh skeleton parameters directly, model optimizes neural network weights predict 3d shape skeleton configurations given monocular rgb video. model trained using combination strong supervision synthetic data, self-supervision differentiable rendering (a) skeletal keypoints, (b) dense 3d mesh motion, (c) human-background segmentation, end-to-end framework. 
empirically show model combines best worlds supervised learning test-time optimization: supervised learning initializes model parameters right regime, ensuring good pose surface initialization test time, without manual effort. self-supervision back-propagating differentiable rendering allows (unsupervised) adaptation model test data, offers much tighter fit pretrained fixed model. show proposed model improves experience converges low-error solutions previous optimization methods fail.",4 "probabilistic prototype models attributed graphs. contribution proposes new approach towards developing class probabilistic methods classifying attributed graphs. key concept random attributed graph, defined attributed graph whose nodes edges annotated random variables. every node/edge two random processes associated it: occurrence probability probability distribution attribute values. estimated within maximum likelihood framework. likelihood random attributed graph generate outcome graph used feature classification. proposed approach fast robust noise.",4 "simple hierarchical pooling data structure loop closure. propose data structure obtained hierarchically averaging bag-of-word descriptors sequence views achieves average speedups large-scale loop closure applications ranging 4 20 times benchmark datasets. although simple, method works well sophisticated agglomerative schemes fraction cost minimal loss performance.",4 "artificial agents speculative bubbles. pertaining agent-based computational economics (ace), work presents two models rise downfall speculative bubbles exchange price fixing based double auction mechanisms. first model based finite time horizon context, expected dividends decrease along time. second model follows ""greater fool"" hypothesis; agent behaviour depends comparison estimated risk greater fool's. 
simulations shed light influential parameters necessary conditions emergence speculative bubbles asset market within considered framework.",4 "unsupervised learning disentangled interpretable representations sequential data. present factorized hierarchical variational autoencoder, learns disentangled interpretable representations sequential data without supervision. specifically, exploit multi-scale nature information sequential data formulating explicitly within factorized hierarchical graphical model imposes sequence-dependent priors sequence-independent priors different sets latent variables. model evaluated two speech corpora demonstrate, qualitatively, ability transform speakers linguistic content manipulating different sets latent variables; quantitatively, ability outperform i-vector baseline speaker verification reduce word error rate much 35% mismatched train/test scenarios automatic speech recognition tasks.",4 "predictive linear-gaussian models stochastic dynamical systems. models dynamical systems based predictive state representations (psrs) defined strictly terms observable quantities, contrast traditional models (such hidden markov models) use latent variables state-space representations. addition, psrs effectively infinite memory, allowing model systems finite memory-based models cannot. thus far, psr models primarily developed domains discrete observations. here, develop predictive linear-gaussian (plg) model, class psr models domains continuous observations. show plg models subsume linear dynamical system models (also called kalman filter models state-space models) using fewer parameters. also introduce algorithm estimate plg parameters data, contrast standard expectation maximization (em) algorithms used estimate kalman filter parameters. 
show algorithm consistent estimation procedure present preliminary empirical results suggesting algorithm outperforms em, particularly model dimension increases.",4 "boolean matrix factorization noisy completion via message passing. boolean matrix factorization boolean matrix completion noisy observations desirable unsupervised data-analysis methods due interpretability, hard perform due np-hardness. treat problems maximum posteriori inference problems graphical model present message passing approach scales linearly number observations factors. empirical study demonstrates message passing able recover low-rank boolean matrices, boundaries theoretically possible recovery compares favorably state-of-the-art real-world applications, collaborative filtering large-scale boolean data.",12 "accurate tests statistical significance result differences. statistical significance testing differences values metrics like recall, precision balanced f-score necessary part empirical natural language processing. unfortunately, find set experiments many commonly used tests often underestimate significance less likely detect differences exist different techniques. underestimation comes independence assumption often violated. point useful tests make assumption, including computationally-intensive randomization tests.",4 "core kernels. term ""core kernel"" stands correlation-resemblance kernel. many applications (e.g., vision), data often high-dimensional, sparse, non-binary. propose two types (nonlinear) core kernels non-binary sparse data demonstrate effectiveness new kernels classification experiment. core kernels simple tuning parameters. however, training nonlinear kernel svm (very) costly time memory may suitable truly large-scale industrial applications (e.g. search). order make proposed core kernels practical, develop basic probabilistic hashing algorithms transform nonlinear kernels linear kernels.",19 "belief propagation linear programming. 
belief propagation (bp) popular, distributed heuristic performing map computations graphical models. bp interpreted, variational perspective, minimizing bethe free energy (bfe). bp also used solve special class linear programming (lp) problems. class problems, map inference stated integer lp lp relaxation coincides minimization bfe ""zero temperature"". generalize prior results establish tight characterization lp problems formulated equivalent lp relaxation map inference. moreover, suggest efficient, iterative annealing bp algorithm solving broader class lp problems. demonstrate algorithm's performance set weighted matching problems using cutting plane method solve sequence lps tightened adding ""blossom"" inequalities.",4 "nearly optimal robust subspace tracking dynamic robust pca. study robust subspace tracking (rst) problem obtain one first provable guarantees it. goal rst track data lies slowly changing low-dimensional subspace, subspaces themselves, robust corruption (often large magnitude) sparse outliers. simply interpreted dynamic (time-varying) extension robust pca, minor difference rst also requires online algorithm (short tracking delay). propose algorithm called norst (nearly optimal rst) prove solves rst dynamic robust pca weakened versions standard rpca assumptions, slow subspace change, two simple extra assumptions (a lower bound outlier magnitudes, independence columns low-rank matrix). guarantee shows norst enjoys near optimal tracking delay $o(r \log n \log(1/\varepsilon))$. required delay subspace change times same, memory complexity $n$ times value. $n$ ambient space dimension $r$ dimension changing subspaces true data lies. thus also nearly optimal. finally, guarantee also shows norst best outlier tolerance compared previous rpca rst methods, theoretically empirically (including real videos), without requiring model outlier support sets.",4 "using wikipedia boost svd recommender systems. 
singular value decomposition (svd) used successfully recent years area recommender systems. paper present model extended consider user ratings information wikipedia. mapping items wikipedia pages quantifying similarity, able use information order improve recommendation accuracy, especially sparsity high. another advantage proposed approach fact easily integrated svd implementation, regardless additional parameters may added it. preliminary experimental results movielens dataset encouraging.",4 "another perspective default reasoning. lexicographic closure given finite set normal defaults defined. conditional assertion ""if b"" lexicographic closure if, given defaults fact a, one would conclude b. lexicographic closure essentially rational extension d, rational closure, defined previous paper. provides logic normal defaults different one proposed r. reiter rich enough require consideration non-normal defaults. large number examples provided show lexicographic closure corresponds basic intuitions behind reiter's logic defaults.",4 "iteration complexity randomized block-coordinate descent methods minimizing composite function. paper develop randomized block-coordinate descent method minimizing sum smooth simple nonsmooth block-separable convex function prove obtains $\epsilon$-accurate solution probability least $1-\rho$ $o(\tfrac{n}{\epsilon} \log \tfrac{1}{\rho})$ iterations, $n$ number blocks. strongly convex functions method converges linearly. extends recent results nesterov [efficiency coordinate descent methods huge-scale optimization problems, core discussion paper #2010/2], cover smooth case, composite minimization, time improving complexity factor 4 removing $\epsilon$ logarithmic term. importantly, contrast aforementioned work author achieves results applying method regularized version objective function unknown scaling factor, show necessary, thus achieving true iteration complexity bounds. 
smooth case also allow arbitrary probability vectors non-euclidean norms. finally, demonstrate numerically algorithm able solve huge-scale $\ell_1$-regularized least squares support vector machine problems billion variables.",12 "online learning-based framework tracking. study tracking problem, namely, estimating hidden state object time, unreliable noisy measurements. standard framework tracking problem generative framework, basis solutions bayesian algorithm approximation, particle filters. however, solutions sensitive model mismatches. paper, motivated online learning, introduce new framework tracking. provide efficient tracking algorithm framework. provide experimental results comparing algorithm bayesian algorithm simulated data. experiments show slight model mismatches, algorithm outperforms bayesian algorithm.",4 "conflict-driven asp solving external sources. answer set programming (asp) well-known problem solving approach based nonmonotonic logic programs efficient solvers. enable access external information, hex-programs extend programs external atoms, allow bidirectional communication logic program external sources computation (e.g., description logic reasoners web resources). current solvers evaluate hex-programs translation asp itself, values external atoms guessed verified ordinary answer set computation. elegant approach scale number external accesses general, particular presence nondeterminism (which instrumental asp). paper, present novel, native algorithm evaluating hex-programs uses learning techniques. particular, extend conflict-driven asp solving techniques, prevent solver running conflict again, ordinary hex-programs. show gain additional knowledge external source evaluations use conflict-driven algorithm. first target uninformed case, i.e., extra information external sources, extend approach case additional meta-information available. 
experiments show learning external sources significantly decrease runtime number considered candidate compatible sets.",4 "audio-replay attack detection countermeasures. paper presents speech technology center (stc) replay attack detection systems proposed automatic speaker verification spoofing countermeasures challenge 2017. study focused comparison different spoofing detection approaches. gmm based methods, high level features extraction simple classifier deep learning frameworks. experiments performed development evaluation parts challenge dataset demonstrated stable efficiency deep learning approaches case changing acoustic conditions. time svm classifier high level features provided substantial input efficiency resulting stc systems according fusion systems results.",4 "automata networks multi-party communication naming game. naming game studied explore role self-organization development negotiation linguistic conventions. paper, define automata networks approach naming game. two problems faced: (1) definition automata networks multi-party communicative interactions; (2) proof convergence three different orders individuals updated (updating schemes). finally, computer simulations explored two-dimensional lattices purpose recover main features naming game describe dynamics different updating schemes.",4 "post-hoc labeling arbitrary eeg recordings data-efficient evaluation neural decoding methods. many cognitive, sensory motor processes correlates oscillatory neural sources, embedded subspace recorded brain signals. decoding processes noisy magnetoencephalogram/electroencephalogram (m/eeg) signals usually requires use data-driven analysis methods. objective evaluation decoding algorithms experimental raw signals, however, challenge: amount available m/eeg data typically limited, labels unreliable, raw signals often contaminated artifacts. latter specifically problematic, artifacts stem behavioral confounds oscillatory neural processes interest. 
overcome problems, simulation frameworks introduced benchmarking decoding methods. generating artificial brain signals, however, simulation frameworks make strong partially unrealistic assumptions brain activity, limits generalization obtained results real-world conditions. present contribution, strive remove many shortcomings current simulation frameworks propose versatile alternative, allows objective evaluation benchmarking novel data-driven decoding methods neural signals. central idea utilize post-hoc labelings arbitrary m/eeg recordings. strategy makes paradigm-agnostic allows generate comparatively large datasets noiseless labels. source code data novel simulation approach made available facilitating adoption.",4 "integrating cardinal direction relations orientation relations qualitative spatial reasoning. propose calculus integrating two calculi well-known qualitative spatial reasoning (qsr): frank's projection-based cardinal direction calculus, coarser version freksa's relative orientation calculus. original constraint propagation procedure presented, implements interaction two integrated calculi. importance taking account interaction shown real example providing inconsistent knowledge base, whose inconsistency (a) cannot detected reasoning separately two components knowledge, because, taken separately, consistent, (b) detected proposed algorithm, thanks interaction knowledge propagated two components other.",4 "ppmf: patient-based predictive modeling framework early icu mortality prediction. date, developing good model early intensive care unit (icu) mortality prediction still challenging. paper presents patient based predictive modeling framework (ppmf) improve performance icu mortality prediction using data collected first 48 hours icu admission. ppmf consists three main components verifying three related research hypotheses. first component captures dynamic changes patients status icu using time series data (e.g., vital signs laboratory tests). 
second component local approximation algorithm classifies patients based similarities. third component gradient descent wrapper updates feature weights according classification feedback. experiments using data mimiciii show ppmf significantly outperforms: (1) severity score systems, namely saps iii, apache iv, mpm0iii, (2) aggregation based classifiers utilize summarized time series, (3) baseline feature selection methods.",4 "improved anomaly detection crowded scenes via cell-based analysis foreground speed, size texture. robust efficient anomaly detection technique proposed, capable dealing crowded scenes traditional tracking based approaches tend fail. initial foreground segmentation input frames confines analysis foreground objects effectively ignores irrelevant background dynamics. input frames split non-overlapping cells, followed extracting features based motion, size texture cell. feature type independently analysed presence anomaly. unlike methods, refined estimate object motion achieved computing optical flow foreground pixels. motion size features modelled approximated version kernel density estimation, computationally efficient even large training datasets. texture features modelled adaptively grown codebook, number entries codebook selected online fashion. experiments recently published ucsd anomaly detection dataset show proposed method obtains considerably better results three recent approaches: mppca, social force, mixture dynamic textures (mdt). proposed method also several orders magnitude faster mdt, next best performing method.",4 "learning deep features one-class classification. propose deep learning-based solution problem feature learning one-class classification. proposed method operates top convolutional neural network (cnn) choice produces descriptive features maintaining low intra-class variance feature space given class. purpose two loss functions, compactness loss descriptiveness loss proposed along parallel cnn architecture. 
template matching-based framework introduced facilitate testing process. extensive experiments publicly available anomaly detection, novelty detection mobile active authentication datasets show proposed deep one-class (doc) classification method achieves significant improvements state-of-the-art.",4 "online deforestation detection. deforestation detection using satellite images make important contribution forest management. current approaches broadly divided compare two images taken similar periods year monitor changes using multiple images taken growing season. cmfda algorithm described zhu et al. (2012) algorithm builds latter category implementing year-long, continuous, time-series based approach monitoring images. algorithm developed 30m resolution, 16-day frequency reflectance data landsat satellite. work adapt algorithm 1km, 16-day frequency reflectance data modis sensor aboard terra satellite. cmfda algorithm composed two submodels fitted pixel-by-pixel basis. first estimates amount surface reflectance function day year. second estimates occurrence deforestation event comparing last predicted real reflectance values. comparison, reflectance observations six different bands first combined forest index. real predicted values forest index compared high absolute differences consecutive observation dates flagged deforestation events. adapted algorithm also uses two model framework. however, since modis 13a2 dataset used, includes reflectance data different spectral bands included landsat dataset, cannot construct forest index. instead propose two contrasting approaches: multivariate index approach similar cmfda.",19 "microstructure reconstruction using entropic descriptors. multi-scale approach inverse reconstruction pattern's microstructure reported. instead correlation function, pair entropic descriptors (eds) proposed stochastic optimization method. first measures spatial inhomogeneity, binary pattern, compositional one, greyscale image. 
second one quantifies spatial compositional statistical complexity. eds reveal structural information dissimilar, least part, given correlation functions almost discrete length scales. method tested digitized binary greyscale images. cases, persuasive reconstruction microstructure found.",3 "multichannel variable-size convolution sentence classification. propose mvcnn, convolution neural network (cnn) architecture sentence classification. (i) combines diverse versions pretrained word embeddings (ii) extracts features multigranular phrases variable-size convolution filters. also show pretraining mvcnn critical good performance. mvcnn achieves state-of-the-art performance four tasks: small-scale binary, small-scale multi-class largescale twitter sentiment prediction subjectivity classification.",4 "use dempster-shafer conflict metric detect interpretation inconsistency. model world built sensor data may incorrect even sensors functioning correctly. possible causes include use inappropriate sensors (e.g. laser looking glass walls), sensor inaccuracies accumulate (e.g. localization errors), priori models wrong, internal representation match world (e.g. static occupancy grid used dynamically moving objects). interested case constructed model world flawed, access ground truth would allow system see discrepancy, robot entering unknown environment. paper considers problem determining something wrong using sensor data used construct world model. proposes 11 interpretation inconsistency indicators based dempster-shafer conflict metric, con, evaluates indicators according three criteria: ability distinguish true inconsistency sensor noise (classification), estimate magnitude discrepancies (estimation), determine source(s) (if any) sensing problems environment (isolation). evaluation conducted using data mobile robot sonar laser range sensors navigating indoor environments controlled conditions. 
evaluation shows gambino indicator performed best terms estimation (at best 0.77 correlation), isolation, classification sensing situation degraded (7% false negative rate) normal (0% false positive rate).",4 "image retrieval fisher vectors binary features. recently, fisher vector representation local features attracted much attention effectiveness image classification image retrieval. another trend area image retrieval use binary features orb, freak, brisk. considering significant performance improvement accuracy image classification retrieval fisher vector continuous feature descriptors, fisher vector also applied binary features, would receive similar benefits binary feature based image retrieval classification. paper, derive closed-form approximation fisher vector binary features modeled bernoulli mixture model. also propose accelerating fisher vector using approximate value posterior probability. experiments show fisher vector representation significantly improves accuracy image retrieval compared bag binary words approach.",4 "sever: robust meta-algorithm stochastic optimization. high dimensions, machine learning methods brittle even small fraction structured outliers. address this, introduce new meta-algorithm take base learner least squares stochastic gradient descent, harden learner resistant outliers. method, sever, possesses strong theoretical guarantees yet also highly scalable -- beyond running base learner itself, requires computing top singular vector certain $n \times d$ matrix. apply sever drug design dataset spam classification dataset, find cases substantially greater robustness several baselines. spam dataset, $1\%$ corruptions, achieved $7.4\%$ test error, compared $13.4\%-20.5\%$ baselines, $3\%$ error uncorrupted dataset. 
similarly, drug design dataset, $10\%$ corruptions, achieved $1.42$ mean-squared error test error, compared $1.51$-$2.33$ baselines, $1.23$ error uncorrupted dataset.",4 "cryptocurrency portfolio management deep reinforcement learning. portfolio management decision-making process allocating amount fund different financial investment products. cryptocurrencies electronic decentralized alternatives government-issued money, bitcoin best-known example cryptocurrency. paper presents model-less convolutional neural network historic prices set financial assets input, outputting portfolio weights set. network trained 0.7 years' price data cryptocurrency exchange. training done reinforcement manner, maximizing accumulative return, regarded reward function network. backtest trading experiments trading period 30 minutes conducted market, achieving 10-fold returns 1.8 months' periods. recently published portfolio selection strategies also used perform back-tests, whose results compared neural network. network limited cryptocurrency, applied financial markets.",4 "bayesian uncertainty estimation batch normalized deep networks. deep neural networks led series breakthroughs, dramatically improving state-of-the-art many domains. techniques driving advances, however, lack formal method account model uncertainty. bayesian approach learning provides solid theoretical framework handle uncertainty, inference bayesian-inspired deep neural networks difficult. paper, provide practical approach bayesian learning relies regularization technique found nearly every modern network, \textit{batch normalization}. show training deep network using batch normalization equivalent approximate inference bayesian models, demonstrate finding allows us make useful estimates model uncertainty. approach, possible make meaningful uncertainty estimates using conventional architectures without modifying network training procedure. 
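A toy illustration of the batch-norm uncertainty idea above, assuming a single linear map stands in for the trained network: normalise the test input with statistics of randomly drawn training mini-batches and read predictive uncertainty off the spread of the resulting outputs.

```python
import numpy as np

def mcbn_predict(x, train_x, w, b, batch_size=32, n_samples=100, rng=None):
    """Monte Carlo batch-norm uncertainty (toy sketch): each stochastic
    forward pass uses the mean/std of a fresh random training mini-batch."""
    rng = rng or np.random.default_rng()
    preds = []
    for _ in range(n_samples):
        batch = train_x[rng.choice(len(train_x), size=batch_size, replace=False)]
        mu, sd = batch.mean(), batch.std() + 1e-8
        preds.append(w * (x - mu) / sd + b)   # "network" = one linear map
    preds = np.asarray(preds)
    return preds.mean(), preds.std()
```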
approach thoroughly validated series empirical experiments different tasks using various measures, outperforming baselines strong statistical significance displaying competitive performance recent bayesian approaches.",19 "physics-guided neural networks (pgnn): application lake temperature modeling. paper introduces novel framework combining scientific knowledge physics-based models neural networks advance scientific discovery. framework, termed physics-guided neural network (pgnn), leverages output physics-based model simulations along observational features generate predictions using neural network architecture. further, paper presents novel framework using physics-based loss functions learning objective neural networks, ensure model predictions show lower errors training set also scientifically consistent known physics unlabeled set. illustrate effectiveness pgnn problem lake temperature modeling, physical relationships temperature, density, depth water used design physics-based loss function. using scientific knowledge guide construction learning neural networks, able show proposed framework ensures better generalizability well scientific consistency results.",4 "new solution relative orientation problem using 3 points vertical direction. paper presents new method recover relative pose two images, using three points vertical direction information. vertical direction determined two ways: 1- using direct physical measurement like imu (inertial measurement unit), 2- using vertical vanishing point. knowledge vertical direction solves 2 unknowns among 3 parameters relative rotation, 3 homologous points requested position couple images. rewriting coplanarity equations leads simpler solution. remaining unknowns resolution performed algebraic method using grobner bases. elements necessary build specific algebraic solver given paper, allowing real-time implementation. 
results real synthetic data show efficiency method.",4 "optimal algorithm bandit zero-order convex optimization two-point feedback. consider closely related problems bandit convex optimization two-point feedback, zero-order stochastic convex optimization two function evaluations per round. provide simple algorithm analysis optimal convex lipschitz functions. improves \cite{dujww13}, provides optimal result smooth functions; moreover, algorithm analysis simpler, readily extend non-euclidean problems. algorithm based small surprisingly powerful modification gradient estimator.",4 "design statistical quality control procedures using genetic algorithms. general, use algebraic enumerative methods optimize quality control (qc) procedure detect critical random systematic analytical errors stated probabilities, probability false rejection minimum. genetic algorithms (gas) offer alternative, require knowledge objective function optimized search large parameter spaces quickly. explore application gas statistical qc, developed interactive gas based computer program designs novel near optimal qc procedure, given analytical process. program uses deterministic crowding algorithm. illustrative application program suggests potential design qc procedures significantly better 45 alternative ones used clinical laboratories.",4 "opportunistic adaptation knowledge discovery. adaptation long considered achilles' heel case-based reasoning since requires domain-specific knowledge difficult acquire. paper, two strategies combined order reduce knowledge engineering cost induced adaptation knowledge (ca) acquisition task: ca learned case base means knowledge discovery techniques, ca acquisition sessions opportunistically triggered, i.e., problem-solving time.",4 "bayesian test significance conditional independence: multinomial model. 
conditional independence tests (ci tests) received special attention lately machine learning computational intelligence related literature important indicator relationship among variables used models. field probabilistic graphical models (pgm)--which includes bayesian networks (bn) models--ci tests especially important task learning pgm structure data. paper, propose full bayesian significance test (fbst) tests conditional independence discrete datasets. fbst powerful bayesian test precise hypothesis, alternative frequentist's significance tests (characterized calculation \emph{p-value}).",19 "model shrinkage effect gamma process edge partition models. edge partition model (epm) fundamental bayesian nonparametric model extracting overlapping structure binary matrix. epm adopts gamma process ($\gamma$p) prior automatically shrink number active atoms. however, empirically found model shrinkage epm typically work appropriately leads overfitted solution. analysis expectation epm's intensity function suggested gamma priors epm hyperparameters disturb model shrinkage effect internal $\gamma$p. order ensure model shrinkage effect epm works appropriate manner, proposed two novel generative constructions epm: cepm incorporating constrained gamma priors, depm incorporating dirichlet priors instead gamma priors. furthermore, depm's model parameters including infinite atoms $\gamma$p prior could marginalized out, thus possible derive truly infinite depm (idepm) efficiently inferred using collapsed gibbs sampler. experimentally confirmed model shrinkage proposed models works well idepm indicated state-of-the-art performance generalization ability, link prediction accuracy, mixing efficiency, convergence speed.",19 "detection resolution rumours social media: survey. despite increasing use social media platforms information news gathering, unmoderated nature often leads emergence spread rumours, i.e. pieces information unverified time posting. 
time, openness social media platforms provides opportunities study users share discuss rumours, explore natural language processing data mining techniques may used find ways determining veracity. survey introduce discuss two types rumours circulate social media; long-standing rumours circulate long periods time, newly-emerging rumours spawned fast-paced events breaking news, reports released piecemeal often unverified status early stages. provide overview research social media rumours ultimate goal developing rumour classification system consists four components: rumour detection, rumour tracking, rumour stance classification rumour veracity classification. delve approaches presented scientific literature development four components. summarise efforts achievements far towards development rumour classification systems conclude suggestions avenues future research social media mining detection resolution rumours.",4 "study unsupervised adaptive crowdsourcing. consider unsupervised crowdsourcing performance based model wherein responses end-users essentially rated according responses correlate majority responses subtasks/questions. one setting, consider independent sequence identically distributed crowdsourcing assignments (meta-tasks), consider single assignment large number component subtasks. problems yield intuitive results overall reliability crowd factor.",4 "group symmetry non-gaussian covariance estimation. consider robust covariance estimation group symmetry constraints. non-gaussian covariance estimation, e.g., tyler scatter estimator multivariate generalized gaussian distribution methods, usually involve non-convex minimization problems. recently, shown underlying principle behind success extended form convexity geodesics manifold positive definite matrices. modern approach improve estimation accuracy exploit prior knowledge via additional constraints, e.g., restricting attention specific classes covariances adhere prior symmetry structures. 
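The Tyler scatter estimator mentioned above is the usual starting point; a minimal unconstrained fixed-point sketch (the symmetry constraints that are the paper's contribution are not included):

```python
import numpy as np

def tyler_scatter(x, n_iter=100, tol=1e-8):
    """Tyler's M-estimator of scatter via its fixed-point iteration.
    x: (n, d) array of centred samples; returns Sigma normalised to trace d."""
    n, d = x.shape
    sigma = np.eye(d)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        w = 1.0 / np.einsum('ij,jk,ik->i', x, inv, x)  # 1 / (x_i^T Sigma^-1 x_i)
        new = (d / n) * (x * w[:, None]).T @ x
        new *= d / np.trace(new)                       # fix the arbitrary scale
        done = np.max(np.abs(new - sigma)) < tol
        sigma = new
        if done:
            break
    return sigma
```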
paper, prove group symmetry constraints also geodesically convex therefore incorporated various non-gaussian covariance estimators. practical examples sets include: circulant, persymmetric complex/quaternion proper structures. provide simple numerical technique finding maximum likelihood estimates constraints, demonstrate performance advantage using synthetic experiments.",19 "learning, investments derivatives. recent crisis following flight simplicity put derivative businesses around world considerable pressure. argue traditional modeling techniques must extended include product design. propose quantitative framework creating products meet challenge optimal investors point view remaining relatively simple transparent.",17 "rewriting constraint models metamodels. important challenge constraint programming rewrite constraint models executable programs calculating solutions. phase constraint processing may require translations constraint programming languages, transformations constraint representations, model optimizations, tuning solving strategies. paper, introduce pivot metamodel describing common features constraint models including different kinds constraints, statements like conditionals loops, first-class elements like object classes predicates. metamodel general enough cope constructions many languages, object-oriented modeling languages logic languages, independent them. rewriting operations manipulate metamodel instances apart languages. consequence, rewriting operations apply whatever languages selected able manage model semantic information. bridge created metamodel space languages using parsing techniques. tools software engineering world useful implement framework.",4 "curious robot: learning visual representations via physical interactions. right supervisory signal train visual representations? current approaches computer vision use category labels datasets imagenet train convnets.
however, case biological agents, visual representation learning require millions semantic labels. argue biological agents use physical interactions world learn visual representations unlike current vision systems use passive observations (images videos downloaded web). example, babies push objects, poke them, put mouth throw learn representations. towards goal, build one first systems baxter platform pushes, pokes, grasps observes objects tabletop environment. uses four different types physical interactions collect 130k datapoints, datapoint providing supervision shared convnet architecture allowing us learn visual representations. show quality learned representations observing neuron activations performing nearest neighbor retrieval learned representation. quantitatively, evaluate learned convnet image classification tasks show improvements compared learning without external data. finally, task instance retrieval, network outperforms imagenet network recall@1 3%",4 "deep multimodal semantic embeddings speech images. paper, present model takes input corpus images relevant spoken captions finds correspondence two modalities. employ pair convolutional neural networks model visual objects speech signals word level, tie networks together embedding alignment model learns joint semantic space modalities. evaluate model using image search annotation tasks flickr8k dataset, augmented collecting corpus 40,000 spoken captions using amazon mechanical turk.",4 "using english pivot extract persian-italian parallel sentences non-parallel corpora. effectiveness statistical machine translation system (smt) dependent upon amount parallel corpus used training phase. low-resource language pairs enough parallel corpora build accurate smt. paper, novel approach presented extract bilingual persian-italian parallel sentences non-parallel (comparable) corpus. study, english used pivot language compute matching scores source target sentences candidate selection phase. 
additionally, new monolingual sentence similarity metric, normalized google distance (ngd) proposed improve matching process. moreover, extensions baseline system applied improve quality extracted sentences measured bleu. experimental results show using new pivot based extraction increase quality bilingual corpus significantly consequently improves performance persian-italian smt system.",4 "learning repeat: fine grained action repetition deep reinforcement learning. reinforcement learning algorithms learn complex behavioral patterns sequential decision making tasks wherein agent interacts environment acquires feedback form rewards sampled it. traditionally, algorithms make decisions, i.e., select actions execute, every single time step agent-environment interactions. paper, propose novel framework, fine grained action repetition (figar), enables agent decide action well time scale repeating it. figar used improving deep reinforcement learning algorithm maintains explicit policy estimate enabling temporal abstractions action space. empirically demonstrate efficacy framework showing performance improvements top three policy search algorithms different domains: asynchronous advantage actor critic atari 2600 domain, trust region policy optimization mujoco domain deep deterministic policy gradients torcs car racing domain.",4 "trainable neuromorphic integrated circuit exploits device mismatch. random device mismatch arises result scaling cmos (complementary metal-oxide semi-conductor) technology deep submicron regime degrades accuracy analogue circuits. methods combat increase complexity design. developed novel neuromorphic system called trainable analogue block (tab), exploits device mismatch means random projections input higher dimensional space. tab framework inspired principles neural population coding operating biological nervous system. 
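The TAB principle above can be imitated in software: fix a random projection (playing the role of device mismatch) and train only a linear readout. A sketch under that assumption:

```python
import numpy as np

def train_tab_like(x, y, n_hidden=200, rng=None):
    """Random-projection learner in the spirit of the TAB: the fixed random
    weights (r, b) stand in for device mismatch; only the readout w is
    trained, by least squares."""
    rng = rng or np.random.default_rng(0)
    r = rng.normal(size=(x.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    h = np.tanh(x @ r + b)                     # fixed nonlinear hidden layer
    w, *_ = np.linalg.lstsq(h, y, rcond=None)
    return r, b, w

def tab_predict(x, r, b, w):
    return np.tanh(x @ r + b) @ w
```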
three neuronal layers, namely input, hidden, output, constitute tab framework, number hidden layer neurons far exceeding input layer neurons. here, present measurement results first prototype tab chip built using 65nm process technology show learning capability various regression tasks. tab chip exploits inherent randomness variability arising due fabrication process perform various learning tasks. additionally, characterise neuron discuss statistical variability tuning curve arises due random device mismatch, desirable property learning capability tab. also discuss effect number hidden neurons resolution output weights accuracy learning capability tab.",4 "temporal tessellation: unified approach video analysis. present general approach video understanding, inspired semantic transfer techniques successfully used 2d image analysis. method considers video 1d sequence clips, one associated semantics. nature semantics -- natural language captions labels -- depends task hand. test video processed forming correspondences clips clips reference videos known semantics, following which, reference semantics transferred test video. describe two matching methods, designed ensure (a) reference clips appear similar test clips (b), taken together, semantics selected reference clips consistent maintains temporal coherence. use method video captioning lsmdc'16 benchmark, video summarization summe tvsum benchmarks, temporal action detection thumos2014 benchmark, sound prediction greatest hits benchmark. method surpasses state art, four five benchmarks, importantly, single method know successfully applied diverse range tasks.",4 "listen, interact talk: learning speak via interaction. one long-term goals artificial intelligence build agent communicate intelligently human natural language. existing work natural language learning relies heavily training pre-collected dataset annotated labels, leading agent essentially captures statistics fixed external training data. 
training data essentially static snapshot representation knowledge annotator, agent trained way limited adaptiveness generalization behavior. moreover, different language learning process humans, language acquired communication taking speaking action learning consequences speaking action interactive manner. paper presents interactive setting grounded natural language learning, agent learns natural language interacting teacher learning feedback, thus learning improving language skills taking part conversation. achieve goal, propose model incorporates imitation reinforcement leveraging jointly sentence reward feedbacks teacher. experiments conducted validate effectiveness proposed approach.",4 "combining recurrent convolutional neural networks relation classification. paper investigates two different neural architectures task relation classification: convolutional neural networks recurrent neural networks. models, demonstrate effect different architectural choices. present new context representation convolutional neural networks relation classification (extended middle context). furthermore, propose connectionist bi-directional recurrent neural networks introduce ranking loss optimization. finally, show combining convolutional recurrent neural networks using simple voting scheme accurate enough improve results. neural models achieve state-of-the-art results semeval 2010 relation classification task.",4 "hilbert space methods reduced-rank gaussian process regression. paper proposes novel scheme reduced-rank gaussian process regression. method based approximate series expansion covariance function terms eigenfunction expansion laplace operator compact subset $\mathbb{r}^d$. approximate eigenbasis eigenvalues covariance function expressed simple functions spectral density gaussian process, allows gp inference solved computational cost scaling $\mathcal{o}(nm^2)$ (initial) $\mathcal{o}(m^3)$ (hyperparameter learning) $m$ basis functions $n$ data points. 
approach also allows rigorous error analysis hilbert space theory, show approximation becomes exact size compact subset number eigenfunctions go infinity. expansion generalizes hilbert spaces inner product defined integral specified input density. method compared previously proposed methods theoretically empirical tests simulated real data.",19 "additive model view sparse gaussian process classifier design. consider problem designing sparse gaussian process classifier (sgpc) generalizes well. viewing sgpc design constructing additive model like boosting, present efficient effective sgpc design method perform stage-wise optimization predictive loss function. introduce new methods two key components viz., site parameter estimation basis vector selection sgpc design. proposed adaptive sampling based basis vector selection method aids achieving improved generalization performance reduced computational cost. method also used conjunction site parameter estimation methods. similar computational storage complexities well-known information vector machine suitable large datasets. hyperparameters determined optimizing predictive loss function. experimental results show better generalization performance proposed basis vector selection method several benchmark datasets, particularly relatively smaller basis vector set sizes difficult datasets.",4 "evaluating link-based techniques detecting fake pharmacy websites. fake online pharmacies become increasingly pervasive, constituting 90% online pharmacy websites. need fake website detection techniques capable identifying fake online pharmacy websites high degree accuracy. study, compared several well-known link-based detection techniques large-scale test bed hyperlink graph encompassing 80 million links 15.5 million web pages, including 1.2 million known legitimate fake pharmacy pages. found qoc qol class propagation algorithms achieved accuracy 90% dataset. 
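The QoC/QoL algorithms themselves are not reproduced here; a generic dual class propagation sketch over a hyperlink graph, assuming scores spread along both inlinks and outlinks with seed pages clamped:

```python
import numpy as np

def propagate_labels(adj, legit_seeds, fake_seeds, n_iter=20):
    """Two-class label propagation (generic sketch, not the study's QoC/QoL):
    each page averages the legit/fake scores of its in- and out-neighbours;
    known seed pages are re-clamped after every sweep."""
    n = adj.shape[0]
    sym = ((adj + adj.T) > 0).astype(float)   # use inlinks and outlinks
    deg = np.maximum(sym.sum(axis=1), 1.0)
    scores = np.zeros((n, 2))                 # columns: [legit, fake]
    scores[legit_seeds, 0] = 1.0
    scores[fake_seeds, 1] = 1.0
    seeds = scores.copy()
    clamp = list(legit_seeds) + list(fake_seeds)
    for _ in range(n_iter):
        scores = (sym @ scores) / deg[:, None]
        scores[clamp] = seeds[clamp]
    return scores.argmax(axis=1)              # 0 = legit, 1 = fake
```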
results revealed algorithms incorporate dual class propagation well inlink outlink information, page-level site-level graphs, better suited detecting fake pharmacy websites. addition, site-level analysis yielded significantly better results page-level analysis algorithms evaluated.",4 "path algorithm fused lasso signal approximator. lasso well known penalized regression model, adds $l_{1}$ penalty parameter $\lambda_{1}$ coefficients squared error loss function. fused lasso extends model also putting $l_{1}$ penalty parameter $\lambda_{2}$ difference neighboring coefficients, assuming natural ordering. paper, develop fast path algorithm solving fused lasso signal approximator computes solutions values $\lambda_1$ $\lambda_2$. supplement, also give algorithm general fused lasso case predictor matrix $X \in \mathbb{R}^{n \times p}$ $\text{rank}(X)=p$.",19 "narrativeqa reading comprehension challenge. reading comprehension (rc)---in contrast information retrieval---requires integrating information reasoning events, entities, relations across full document. question answering conventionally used assess rc ability, artificial agents children learning read. however, existing rc datasets tasks dominated questions solved selecting answers using superficial information (e.g., local context similarity global term frequency); thus fail test essential integrative aspect rc. encourage progress deeper comprehension language, present new dataset set tasks reader must answer questions stories reading entire books movie scripts. tasks designed successfully answering questions requires understanding underlying narrative rather relying shallow pattern matching salience. show although humans solve tasks easily, standard rc models struggle tasks presented here. provide analysis dataset challenges presents.",4 "uniform deviation bounds unbounded loss functions like k-means.
uniform deviation bounds limit difference model's expected loss loss empirical sample uniformly models learning problem. such, critical component empirical risk minimization. paper, provide novel framework obtain uniform deviation bounds loss functions *unbounded*. main application, allows us obtain bounds $k$-means clustering weak assumptions underlying distribution. fourth moment bounded, prove rate $\mathcal{o}\left(m^{-\frac12}\right)$ compared previously known $\mathcal{o}\left(m^{-\frac14}\right)$ rate. furthermore, show rate also depends kurtosis - normalized fourth moment measures ""tailedness"" distribution. provide improved rates progressively stronger assumptions, namely, bounded higher moments, subgaussianity bounded support.",19 "constraint-satisfaction parser context-free grammars. traditional language processing tools constrain language designers specific kinds grammars. contrast, model-based language specification decouples language design language processing. consequence, model-based language specification tools need general parsers able parse unrestricted context-free grammars. languages specified following approach may ambiguous, parsers must deal ambiguities. model-based language specification also allows definition associativity, precedence, custom constraints. therefore parsers generated model-driven language specification tools need enforce constraints. paper, propose fence, efficient bottom-up chart parser lexical syntactic ambiguity support allows specification constraints and, therefore, enables use model-based language specification practice.",4 "planning graph (dynamic) csp: exploiting ebl, ddb csp search techniques graphplan. paper reviews connections graphplan's planning-graph dynamic constraint satisfaction problem motivates need adapting csp search techniques graphplan algorithm. 
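One classic CSP search technique of the kind discussed above, backtracking with forward checking, can be sketched on a generic binary not-equal network such as graph colouring (a minimal sketch, not the Graphplan adaptation):

```python
def solve_csp(variables, domains, neighbors, assignment=None):
    """Backtracking search with forward checking: after each assignment,
    prune the chosen value from undecided neighbours' domains and fail
    early if any domain empties. domains maps variable -> set of values."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in list(domains[var]):
        pruned, ok = [], True
        for nb in neighbors[var]:
            if nb in assignment:
                continue
            if value in domains[nb]:
                domains[nb].remove(value)        # forward checking: prune
                pruned.append(nb)
                if not domains[nb]:
                    ok = False                   # dead end detected early
                    break
        if ok:
            assignment[var] = value
            result = solve_csp(variables, domains, neighbors, assignment)
            if result:
                return result
            del assignment[var]
        for nb in pruned:                        # restore on backtrack
            domains[nb].add(value)
    return None
```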
describes explanation based learning, dependency directed backtracking, dynamic variable ordering, forward checking, sticky values random-restart search strategies adapted graphplan. empirical results provided demonstrate augmentations improve graphplan's performance significantly (up 1000x speedups) several benchmark problems. special attention paid explanation-based learning dependency directed backtracking techniques empirically found useful improving performance graphplan.",4 "online object tracking proposal selection. tracking-by-detection approaches successful object trackers recent years. success largely determined detector model learn initially update time. however, challenging conditions object undergo transformations, e.g., severe rotation, methods found lacking. paper, address problem formulating proposal selection task making two contributions. first one introducing novel proposals estimated geometric transformations undergone object, building rich candidate set predicting object location. second one devising novel selection strategy using multiple cues, i.e., detection score edgeness score computed state-of-the-art object edges motion boundaries. extensively evaluate approach visual object tracking 2014 challenge online tracking benchmark datasets, show best performance.",4 "empirical analysis multiple-turn reasoning strategies reading comprehension tasks. reading comprehension (rc) challenging task requires synthesis information across sentences multiple turns reasoning. using state-of-the-art rc model, empirically investigate performance single-turn multiple-turn reasoning squad ms marco datasets. rc model end-to-end neural network iterative attention, uses reinforcement learning dynamically control number turns. find multiple-turn reasoning outperforms single-turn reasoning question answer types; further, observe enabling flexible number turns generally improves upon fixed multiple-turn strategy. 
across question types, particularly beneficial questions lengthy, descriptive answers. achieve results competitive state-of-the-art two datasets.",4 "sharing hash codes multiple purposes. locality sensitive hashing (lsh) powerful tool sublinear-time approximate nearest neighbor search, variety hashing schemes proposed different dissimilarity measures. however, hash codes significantly depend dissimilarity, prohibits users adjusting dissimilarity query time. paper, propose multiple purpose lsh (mp-lsh) shares hash codes different dissimilarities. mp-lsh supports l2, cosine, inner product dissimilarities, corresponding weighted sums, weights adjusted query time. also allows us modify importance pre-defined groups features. thus, mp-lsh enables us, example, retrieve similar items query user preference taken account, find similar material query properties (stability, utility, etc.) optimized, turn part multi-modal information (brightness, color, audio, text, etc.) image/video retrieval. theoretically empirically analyze performance three variants mp-lsh, demonstrate usefulness real-world data sets.",19 "learning non-gaussian time series using box-cox gaussian process. gaussian processes (gps) bayesian nonparametric generative models provide interpretability hyperparameters, admit closed-form expressions training inference, able accurately represent uncertainty. model general non-gaussian data complex correlation structure, gps paired expressive covariance kernel fed nonlinear transformation (or warping). however, overparametrising kernel warping known to, respectively, hinder gradient-based training make predictions computationally expensive. remedy issue (i) training model using derivative-free global-optimisation techniques find meaningful maxima model likelihood, (ii) proposing warping function based celebrated box-cox transformation requires minimal numerical approximations---unlike existing warped gp models.
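The Box-Cox warping itself is elementary and exactly invertible for positive data, which is part of what makes the construction above attractive; a minimal sketch:

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox warping for y > 0; lam = 0 is the log limit."""
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def boxcox_inverse(z, lam):
    """Exact inverse, valid wherever lam * z + 1 > 0."""
    return np.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)
```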
validate proposed approach first showing predictions computed analytically, learning, reconstruction forecasting experiment using real-world datasets.",19 "note sample complexity learning binary output neural networks fixed input distributions. show learning sample complexity sigmoidal neural network constructed sontag (1992) required achieve given misclassification error fixed purely atomic distribution grow arbitrarily fast: prescribed rate growth input distribution rate sample complexity, bound asymptotically tight. rate superexponential, non-recursive function, etc. observe sontag's ann glivenko-cantelli input distribution non-atomic part.",4 "invariant scattering convolution networks. wavelet scattering network computes translation invariant image representation, stable deformations preserves high frequency information classification. cascades wavelet transform convolutions non-linear modulus averaging operators. first network layer outputs sift-type descriptors whereas next layers provide complementary invariant information improves classification. mathematical analysis wavelet scattering networks explains important properties deep convolution networks classification. scattering representation stationary processes incorporates higher order moments thus discriminate textures fourier power spectrum. state art classification results obtained handwritten digits texture discrimination, using gaussian kernel svm generative pca classifier.",4 "interpretable 3d human action analysis temporal convolutional networks. discriminative power modern deep learning models 3d human action recognition growing ever potent. conjunction recent resurgence 3d human action representation 3d skeletons, quality pace recent progress significant. however, inner workings state-of-the-art learning based methods 3d human action recognition still remain mostly black-box. work, propose use new class models known temporal convolutional neural networks (tcn) 3d human action recognition. 
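The building block of a TCN, as named above, is a causal (optionally dilated) 1-D convolution, so the output at time t never depends on future frames; a minimal numpy sketch:

```python
import numpy as np

def causal_conv1d(x, w, dilation=1):
    """Causal dilated 1-D convolution:
    y[t] = sum_j w[j] * x[t - j*dilation], with zero left-padding so the
    output keeps the input length and never looks at future time steps."""
    k, pad = len(w), (len(w) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    return np.array([sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
                     for t in range(len(x))])
```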
compared popular lstm-based recurrent neural network models, given interpretable input 3d skeletons, tcn provides us way explicitly learn readily interpretable spatio-temporal representations 3d human action recognition. provide strategy re-designing tcn interpretability mind characteristics model leveraged construct powerful 3d activity recognition method. work, wish take step towards spatio-temporal model easier understand, explain interpret. resulting model, res-tcn, achieves state-of-the-art results largest 3d human action recognition dataset, ntu-rgbd.",4 "generalized end-to-end loss speaker verification. paper, propose new loss function called generalized end-to-end (ge2e) loss, makes training speaker verification models efficient previous tuple-based end-to-end (te2e) loss function. unlike te2e, ge2e loss function updates network way emphasizes examples difficult verify step training process. additionally, ge2e loss require initial stage example selection. properties, model new loss function decreases speaker verification eer 10%, reducing training time 60% time. also introduce multireader technique, allows us domain adaptation - training accurate model supports multiple keywords (i.e. ""ok google"" ""hey google"") well multiple dialects.",6 "introduction bag features paradigm image classification retrieval. past decade seen growing popularity bag features (bof) approaches many computer vision tasks, including image classification, video search, robot localization, texture recognition. part appeal simplicity. bof methods based orderless collections quantized local image descriptors; discard spatial information therefore conceptually computationally simpler many alternative methods. despite this, perhaps this, bof-based systems set new performance standards popular image classification benchmarks achieved scalability breakthroughs image retrieval. paper presents introduction bof image representations, describes critical design choices, surveys bof literature. 
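The core BoF encoding surveyed above is a nearest-codeword histogram; a minimal sketch, assuming a codebook (e.g. k-means centroids over local descriptors) is already available:

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Orderless bag-of-features encoding: assign each local descriptor to
    its nearest codeword and count. descriptors: (n, d); codebook: (k, d)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                  # L1-normalised histogram
```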
particular emphasis is placed on recent techniques that mitigate quantization errors, improve feature detection, and speed up image retrieval. at the same time, unresolved issues and fundamental challenges are raised. among the unresolved issues are determining the best techniques for sampling images, describing local image features, and evaluating system performance. among the fundamental challenges are whether bof methods can contribute to localizing objects in complex images, or to associating high-level semantics with natural images. this survey should be useful both for introducing new investigators to the field and for providing existing researchers with a consolidated reference to related work.",4 "chases and escapes, and optimization problems. we propose a new approach for solving combinatorial optimization problems by utilizing the mechanism of chases and escapes, which has a long history in mathematics. in addition to the well-used steepest descent and neighboring search, we perform a chase and escape game on the ""landscape"" of the cost function. we have created a concrete algorithm for the traveling salesman problem. our preliminary test indicates the possibility that this new fusion of chases and escapes with combinatorial optimization search could be fruitful.",4 "permutation nmf. nonnegative matrix factorization (nmf) is a commonly used technique in machine learning to extract features from data such as text documents and images, thanks to its natural clustering properties. in particular, it is popular in image processing, since it can decompose several pictures and recognize common parts and where they're located in the photos. this paper's aim is to present a way to add translation invariance to the classical nmf, that is, the algorithms presented are able to detect common features even when they're shifted to different positions in the original images.",4 "an argumentation system for reasoning in conflict-minimal paraconsistent alc. the semantic web is an open and distributed environment in which it is hard to guarantee consistency of knowledge and information. under standard two-valued semantics, everything is entailed if knowledge and information are inconsistent. the semantics of the paraconsistent logic lp offers a solution. however, if the available knowledge and information is consistent, the set of conclusions entailed under the three-valued semantics of the paraconsistent logic lp is smaller than the set of conclusions entailed under two-valued semantics. 
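the permutation-nmf entry above builds on classical nmf. as a baseline sketch, here are plain lee-seung multiplicative updates on a random nonnegative matrix; this is the standard algorithm, not the translation-invariant variant the abstract describes:

```python
import numpy as np

def nmf(V, rank, iters=200, seed=0):
    """factor a nonnegative matrix V ≈ W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 1e-3
    H = rng.random((rank, n)) + 1e-3
    for _ in range(iters):
        # multiplicative updates keep W and H nonnegative by construction
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

V = np.abs(np.random.default_rng(1).random((20, 15)))
W, H = nmf(V, rank=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, round(err, 3))
```

a shift-invariant variant would additionally search over translations of the learned parts when matching them against each image.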
preferring conflict-minimal three-valued interpretations eliminates this difference. preferring conflict-minimal interpretations, however, introduces non-monotonicity. to handle the non-monotonicity, this paper proposes an assumption-based argumentation system. the assumptions needed to close branches of a semantic tableaux form the arguments. stable extensions of the set of derived arguments correspond to conflict-minimal interpretations, and conclusions entailed by all conflict-minimal interpretations are supported by arguments in all stable extensions.",4 "nonparametric sparse representation. this paper suggests a nonparametric scheme to find the sparse solution of an underdetermined system of linear equations in the presence of unknown impulsive or non-gaussian noise. this approach is robust against any variations of the noise model and its parameters. it is based on the minimization of the rank pseudo norm of the residual signal and the l_1-norm of the signal of interest, simultaneously. we use the steepest descent method to find the sparse solution via an iterative algorithm. simulation results show that our proposed method outperforms existing methods like omp, bp, lasso, and bcs whenever the observation vector is contaminated with measurement or environmental non-gaussian noise with unknown parameters. furthermore, in low snr conditions, the proposed method has better performance in the presence of gaussian noise.",4 "a hybrid approach for hindi-english machine translation. in this paper, an extended and combined approach of phrase based statistical machine translation (smt), example based mt (ebmt) and rule based mt (rbmt) is proposed to develop a novel hybrid data driven mt system capable of outperforming the baseline smt, ebmt and rbmt systems from which it is derived. in short, the proposed hybrid mt process is guided by the rule based mt after getting a set of partial candidate translations provided by the ebmt and smt subsystems. previous works have shown that ebmt systems are capable of outperforming phrase-based smt systems, while the rbmt approach has the strength of generating structurally and morphologically accurate results. this hybrid approach increases the fluency, accuracy and grammatical precision, which improves the quality of the machine translation system. a comparison of the proposed hybrid machine translation (htm) model with renowned translators i.e. 
google, bing and babylonian is also presented, which shows that the proposed model works better on sentences with ambiguity as well as those comprised of idioms, among others.",4 "real-time distracted driver posture classification. distracted driving is a worldwide problem leading to an astoundingly increasing number of accidents and deaths. existing work is concerned with a small set of distractions (mostly, cell phone usage). also, for the most part, it uses unreliable ad-hoc methods to detect those distractions. in this paper, we present the first publicly available dataset for ""distracted driver"" posture estimation with more distraction postures than existing alternatives. in addition, we propose a reliable system that achieves 95.98% driving posture classification accuracy. the system consists of a genetically-weighted ensemble of convolutional neural networks (cnns). we show that a weighted ensemble of classifiers using a genetic algorithm yields better classification confidence. we also study the effect of different visual elements (i.e. hands and face) in distraction detection by means of face and hand localizations. finally, we present a thinned version of our ensemble that could achieve 94.29% classification accuracy and operate in a real-time environment.",4 applications of fuzzy logic to case-based reasoning. this article discusses applications of fuzzy logic ideas to formalizing the case-based reasoning (cbr) process and to measuring the effectiveness of cbr systems,4 "spectral clustering with jensen-type kernels and their multi-point extensions. motivated by multi-distribution divergences, which originate in information theory, we propose a notion of `multi-point' kernels, and study their applications. we study a class of kernels based on jensen type divergences and show how they can be extended to measure similarity among multiple points. we study tensor flattening methods and develop a multi-point (kernel) spectral clustering (msc) method. we emphasize a special case of the proposed kernels, which is a multi-point extension of the linear (dot-product) kernel, and show the existence of a cubic time tensor flattening algorithm in this case. finally, we illustrate the usefulness of our contributions using standard data sets and image segmentation tasks.",4 filament and flare detection in hα image sequences. solar storms can have a major impact on the infrastructure of the earth. 
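the distracted-driver entry above fuses cnn outputs with a genetically-weighted ensemble. the weighted-vote step might be sketched as follows; the weights here are made up for illustration, whereas the paper would obtain them from a genetic algorithm:

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """probs: (n_models, n_samples, n_classes); weights: (n_models,)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize to a convex combination
    fused = np.tensordot(w, probs, axes=1)   # (n_samples, n_classes)
    return fused.argmax(axis=1), fused

probs = np.array([
    [[0.6, 0.4], [0.2, 0.8]],   # member model 1's class probabilities
    [[0.4, 0.6], [0.3, 0.7]],   # member model 2's class probabilities
])
labels, fused = weighted_ensemble(probs, weights=[2.0, 1.0])
print(labels)
```

a genetic algorithm would search over the `weights` vector to maximize validation accuracy of the fused prediction.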
their causing events can be observed from the ground in the h{\alpha} spectral line. in this paper we propose a new method for the simultaneous detection of flares and filaments in h{\alpha} image sequences. we therefore perform several preprocessing steps to enhance and normalize the images. based on the intensity values, we segment the image with a variational approach. in a final postprocessing step we derive essential properties to classify the events, and demonstrate the performance by comparing our obtained results to data annotated by an expert. the information produced by our method can be used for near real-time alerts and for the statistical analysis of existing data by solar physicists.,4 "replica exchange using the q-gaussian swarm quantum particle intelligence method. we present a newly developed replica exchange algorithm using the q-gaussian swarm quantum particle optimization (rex@q-gsqpo) method for solving the problem of finding the global optimum. the basis of the algorithm is to run multiple copies of independent swarms at different values of the q parameter. based on an energy criterion, chosen to satisfy detailed balance, we swap the particle coordinates of neighboring swarms at regular iteration intervals. the swarm replicas with high q values are characterized by high diversity of particles, allowing escaping local minima faster, while the low q replicas, characterized by low diversity of particles, are used to sample the local basins efficiently. we compare the new algorithm to the standard gaussian swarm quantum particle optimization (gsqpo) and q-gaussian swarm quantum particle optimization (q-gsqpo) algorithms, and we found that the new algorithm is more robust in terms of the number of fitness function calls, and more efficient in terms of its ability to converge to the global minimum. in addition, we also provide a method for optimally allocating the swarm replicas among different q values. our algorithm is tested on three benchmark functions, which are known to be multimodal problems, at different dimensionalities. in addition, we considered a polyalanine peptide of 12 residues modeled using a g\=o coarse-graining potential energy function.",4 "learning deep convolutional features for mri based alzheimer's disease classification. 
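the replica-exchange entry above swaps particle coordinates between neighbouring swarms under an energy criterion chosen to satisfy detailed balance. a generic metropolis-style swap test is sketched below; the inverse-temperature values are illustrative stand-ins for the swarms' q-dependent acceptance parameters, not the paper's exact criterion:

```python
import numpy as np

def swap_accepted(e_i, e_j, beta_i, beta_j, rng):
    """metropolis test for exchanging configurations with energies e_i, e_j
    between replicas at inverse temperatures beta_i, beta_j."""
    delta = (beta_i - beta_j) * (e_i - e_j)
    # moving the higher-energy configuration to the hotter replica is
    # always accepted; otherwise accept with probability exp(delta)
    return delta >= 0 or rng.random() < np.exp(delta)

rng = np.random.default_rng(0)
# cold replica (beta=2.0) holds the higher energy: swap is always accepted
print(swap_accepted(e_i=5.0, e_j=1.0, beta_i=2.0, beta_j=0.5, rng=rng))
```

this acceptance rule is what makes the combined chain satisfy detailed balance across replicas.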
effective and accurate diagnosis of alzheimer's disease (ad) and mild cognitive impairment (mci) is critical for early treatment and has thus attracted more and more attention nowadays. since its first introduction, machine learning methods have been gaining increasing popularity in ad related research. among the various identified biomarkers, magnetic resonance imaging (mri) is widely used for the prediction of ad or mci. however, before a machine learning algorithm can be applied, image features need to be extracted to represent the mri images. good representations can be pivotal to the classification performance, but almost all previous studies typically rely on human labelling to find the regions of interest (roi) which may be correlated with ad, such as the hippocampus, amygdala, precuneus, etc. this procedure requires domain knowledge and is costly and tedious. instead of relying on the extraction of roi features, it is more promising to remove manual roi labelling from the pipeline and directly work on the raw mri images. in other words, we let the machine learning methods figure out the informative and discriminative image structures for ad classification. in this work, we propose to learn deep convolutional image features using unsupervised and supervised learning. deep learning has emerged as a powerful tool in the machine learning community and has been successfully applied to various tasks. we thus propose to exploit deep features of mri images based on a pre-trained large convolutional neural network (cnn) for ad and mci classification, which spares the effort of the manual roi annotation process.",4 "feature importance in bayesian assessment of newborn brain maturity from eeg. the methodology of bayesian model averaging (bma) is applied for assessment of newborn brain maturity from sleep eeg. in theory, this methodology provides more accurate assessments of uncertainty in decisions. however, the existing bma techniques have been shown to provide biased assessments in the absence of prior information enabling exploration of the model parameter space in detail within a reasonable time. the lack of detail leads to disproportional sampling of the posterior distribution. in the case of eeg assessment of brain maturity, the bma results can be biased because of the absence of information about eeg feature importance. in this paper we explore how posterior information about eeg features can be used in order to reduce the negative impact of disproportional sampling on bma performance. 
we use eeg data recorded from sleeping newborns to test the efficiency of the proposed bma technique.",4 "accnet: actor-coordinator-critic net for ""learning-to-communicate"" with deep multi-agent reinforcement learning. communication is a critical factor for the big multi-agent world to stay organized and productive. typically, previous multi-agent ""learning-to-communicate"" studies try to predefine the communication protocols or use technologies such as tabular reinforcement learning and evolutionary algorithms, which cannot generalize to a changing environment or a large collection of agents. in this paper, we propose an actor-coordinator-critic net (accnet) framework for solving the ""learning-to-communicate"" problem. the accnet naturally combines the powerful actor-critic reinforcement learning technology with deep learning technology. it can efficiently learn the communication protocols, even from scratch, under a partially observable environment. we demonstrate that the accnet can achieve better results than several baselines under both continuous and discrete action space environments. we also analyse the learned protocols and discuss design considerations.",4 "random binary mappings for kernel learning and efficient svm. support vector machines (svms) are powerful learners that have led to state-of-the-art results in various computer vision problems. svms suffer from various drawbacks in terms of selecting the right kernel, which depends on the image descriptors, as well as computational and memory efficiency. this paper introduces a novel kernel, which addresses these issues well. the kernel is learned by exploiting a large amount of low-complex, randomized binary mappings of the input feature. this leads to an efficient svm, while also alleviating the task of kernel selection. we demonstrate the capabilities of our kernel on 6 standard vision benchmarks, in which we combine several common image descriptors, namely histograms (flowers17 and daimler), attribute-like descriptors (uci, osr, a-voc08), and sparse quantization (imagenet). results show that our kernel learning adapts well to the different descriptor types, achieving the performance of the kernels specifically tuned for each image descriptor, and with an evaluation cost similar to efficient svm methods.",4 "deep learning based large scale visual recommendation and search for e-commerce. 
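the random-binary-mappings entry above builds its kernel from many low-complexity binary maps of the input feature. one common such map is sign-of-random-projection coding, sketched here as an assumed illustration (the paper's exact mappings may differ):

```python
import numpy as np

def binary_map(X, n_bits, seed=0):
    """map each row of X to an n_bits-long {0,1} code via random projections."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], n_bits))
    return (X @ R > 0).astype(np.float64)

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 16))
B = binary_map(X, n_bits=256)
# hamming agreement between two codes rises with the inputs' angular similarity
agree = (B[:1] == B).mean(axis=1)
print(B.shape, agree[0])
```

such codes approximate an angular kernel, and a linear svm on the binary features can then stand in for a kernel svm at much lower cost.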
in this paper, we present a unified end-to-end approach to building a large scale visual search and recommendation system for e-commerce. previous works have targeted these problems in isolation. we believe a more effective and elegant solution could be obtained by tackling them together. we propose a unified deep convolutional neural network architecture, called visnet, to learn embeddings that capture the notion of visual similarity across several semantic granularities. we demonstrate the superiority of our approach for the task of image retrieval by comparing against the state-of-the-art on the exact street2shop dataset. we then share the design decisions and trade-offs made while deploying the model to power visual recommendations across a catalog of 50m products, supporting 2k queries a second at flipkart, india's largest e-commerce company. the deployment of our solution has yielded a significant business impact, as measured by the conversion-rate.",4 "fast convergent algorithms for expectation propagation approximate bayesian inference. we propose a novel algorithm to solve the expectation propagation relaxation of bayesian inference for continuous-variable graphical models. in contrast to previous algorithms, our method is provably convergent. by marrying convergent ep ideas (opper&winther 05) with covariance decoupling techniques (wipf&nagarajan 08, nickisch&seeger 09), it runs at least an order of magnitude faster than the most commonly used ep solver.",19 "code completion with neural attention and pointer networks. intelligent code completion has become an essential tool to accelerate modern software development. to facilitate effective code completion for dynamically-typed programming languages, we apply neural language models learned from large codebases, and investigate the effectiveness of the attention mechanism on the code completion task. however, standard neural language models, even with the attention mechanism, cannot correctly predict out-of-vocabulary (oov) words, which restricts the code completion performance. in this paper, inspired by the prevalence of locally repeated terms in program source code, and by the recently proposed pointer networks which can reproduce words from the local context, we propose a pointer mixture network for better predicting oov words in code completion. 
based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an rnn component, or copy an oov word from the local context through a pointer component. experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.",4 "using dissortative mating genetic algorithms to track extrema in dynamic deceptive functions. in traditional genetic algorithms (gas), the mating schemes select individuals for crossover independently of their genotypic or phenotypic similarities. in nature, this behaviour is known as random mating. however, non-random schemes - in which individuals mate according to their kinship or likeness - are more common in natural systems. previous studies indicate that, when applied to gas, negative assortative mating (a specific type of non-random mating, also known as dissortative mating) may improve their performance (in both speed and reliability) on a wide range of problems. dissortative mating maintains genetic diversity at a higher level during the run, and that fact is frequently observed as an explanation for dissortative gas' ability to escape local optima traps. dynamic problems, due to their specificities, demand special care when tuning a ga, because diversity plays an even more crucial role than it does when tackling static ones. this paper investigates the behaviour of dissortative mating gas, namely the recently proposed adaptive dissortative mating ga (admga), on dynamic trap functions. admga selects parents according to their hamming distance, via a self-adjustable threshold value. the method, by keeping the population diversity during the run, provides an effective means to deal with dynamic problems. tests conducted with deceptive and nearly deceptive trap functions indicate that admga is able to outperform other gas, including some specifically designed for tracking moving extrema, on a wide range of tests, being particularly effective when the speed of change is fast. when comparing the algorithm to a previously proposed dissortative ga, results show that the performance is equivalent in the majority of the experiments, but admga performs better when solving the hardest instances of the test set.",4 "seeing small faces from robust anchor's perspective. 
this paper introduces a novel anchor design to support anchor-based face detection with superior scale-invariant performance, especially on tiny faces. to achieve this, we explicitly address the problem that anchor-based detectors drop performance drastically on faces with tiny sizes, e.g. less than 16x16 pixels. in this paper, we investigate why this is the case. we discover that current anchor design cannot guarantee high overlaps between tiny faces and anchor boxes, which increases the difficulty of training. the new expected max overlapping (emo) score is proposed, which can theoretically explain the low overlapping issue and inspire several effective strategies of new anchor design leading to higher face overlaps, including anchor stride reduction with new network architectures, extra shifted anchors, and stochastic face shifting. comprehensive experiments show that our proposed method significantly outperforms the baseline anchor-based detector, while consistently achieving state-of-the-art results on challenging face detection datasets with competitive runtime speed.",4 "imprecise probability assessments and conditional probabilities with quasi additive classes of conditioning events. in this paper, starting from a generalized coherent (i.e. avoiding uniform loss) interval-valued probability assessment on a finite family of conditional events, we construct conditional probabilities with quasi additive classes of conditioning events which are consistent with the given initial assessment. quasi additivity assures coherence for the obtained conditional probabilities. in order to reach our goal we define a finite sequence of conditional probabilities by exploiting theoretical results on g-coherence. in particular, we use solutions of a finite sequence of linear systems.",4 "towards end-to-end speech recognition with deep convolutional neural networks. convolutional neural networks (cnns) are effective models for reducing spectral variations and modeling spectral correlations in acoustic features for automatic speech recognition (asr). hybrid speech recognition systems incorporating cnns with hidden markov models/gaussian mixture models (hmms/gmms) have achieved the state-of-the-art on various benchmarks. 
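the small-faces entry above hinges on the overlap (intersection-over-union) between tiny faces and anchor boxes. a minimal sketch of that computation, with two hypothetical anchors against a 16x16 face (boxes are [x1, y1, x2, y2]):

```python
import numpy as np

def iou(box, anchors):
    """intersection-over-union of one box against an array of anchors."""
    x1 = np.maximum(box[0], anchors[:, 0])
    y1 = np.maximum(box[1], anchors[:, 1])
    x2 = np.minimum(box[2], anchors[:, 2])
    y2 = np.minimum(box[3], anchors[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    return inter / (area_b + area_a - inter)

face = np.array([10, 10, 26, 26])                   # a 16x16 "tiny" face
anchors = np.array([[0, 0, 32, 32], [8, 8, 24, 24]])
print(iou(face, anchors))
```

even the fully-enclosing 32x32 anchor only reaches iou 0.25 here, which illustrates why coarse anchor grids struggle to cover tiny faces.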
meanwhile, connectionist temporal classification (ctc) with recurrent neural networks (rnns), which was proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings. however, rnns are computationally expensive and sometimes difficult to train. in this paper, inspired by the advantages of both cnns and the ctc approach, we propose an end-to-end speech framework for sequence labeling, combining hierarchical cnns with ctc directly, without recurrent connections. by evaluating the approach on the timit phoneme recognition task, we show that the proposed model is not only computationally efficient, but also competitive with the existing baseline systems. moreover, we argue that cnns have the capability to model temporal correlations with appropriate context information.",4 "learning with tensors in reproducing kernel hilbert spaces with multilinear spectral penalties. we present a general framework to learn functions in tensor product reproducing kernel hilbert spaces (tp-rkhss). the methodology is based on a novel representer theorem suitable for existing as well as new spectral penalties for tensors. when the functions in the tp-rkhs are defined on the cartesian product of finite discrete sets, in particular, our main problem formulation admits existing tensor completion problems as a special case. other special cases include transfer learning with multimodal side information and multilinear multitask learning. for the latter case, our kernel-based view is instrumental to derive nonlinear extensions of existing model classes. we give a novel algorithm and show in experiments the usefulness of the proposed extensions.",4 "intra-and-inter-constraint-based video enhancement based on piecewise tone mapping. video enhancement plays an important role in various video applications. in this paper, we propose a new intra-and-inter-constraint-based video enhancement approach aiming to 1) achieve high intra-frame quality of the entire picture, where multiple regions-of-interest (rois) can be adaptively and simultaneously enhanced, and 2) guarantee the inter-frame quality consistencies among video frames. we first analyze the features of different rois and create a piecewise tone mapping curve for the entire frame such that the intra-frame quality of a frame can be enhanced. 
we further introduce new inter-frame constraints to improve the temporal quality consistency. experimental results show that the proposed algorithm obviously outperforms the state-of-the-art algorithms.",4 "a survey on visual analysis of human motion and its applications. this paper summarizes the recent progress in human motion analysis and its applications. in the beginning, we review the motion capture systems and the representation models of human motion data. next, we sketch the advanced human motion data processing technologies, including motion data filtering, temporal alignment, and segmentation. the following parts overview the state-of-the-art approaches for action recognition and dynamics measuring, since these two are the most active research areas in human motion analysis. the last part discusses some emerging applications of human motion analysis in healthcare, human robot interaction, security surveillance, virtual reality and animation. the promising research topics of human motion analysis in the future are also summarized in the last part.",4 "a lexical analysis tool with ambiguity support. lexical ambiguities naturally arise in languages. we present lamb, a lexical analyzer that produces a lexical analysis graph describing all the possible sequences of tokens that can be found within the input string. parsers can process such lexical analysis graphs and discard any sequence of tokens that does not produce a valid syntactic sentence, therefore performing, together with lamb, a context-sensitive lexical analysis for lexically-ambiguous language specifications.",4 "random weights for texture generation in one layer neural networks. recent work in the literature has shown experimentally that one can use the lower layers of a trained convolutional neural network (cnn) to model natural textures. interestingly, it has also been experimentally shown that only one layer with random filters can also model textures, although with less variability. in this paper we ask the question: why are one layer cnns with random filters so effective in generating textures? we theoretically show that one layer convolutional architectures (without a non-linearity), paired with the energy function used in previous literature, can in fact preserve and modulate frequency coefficients in a manner such that random weights and pretrained weights will generate the same type of images. 
based on these results and analysis, we question whether similar properties hold in the case where one uses one convolution layer with a non-linearity. we show that in the case of the relu non-linearity there are situations where only one input gives the minimum possible energy, whereas in the case of no nonlinearity there are always infinite solutions that give the minimum possible energy. thus we show that in certain situations adding a relu non-linearity generates less variable images.",4 "learning to understand phrases by embedding the dictionary. distributional models that learn rich semantic word representations are a success story of recent nlp research. however, developing models that learn useful representations of phrases and sentences has proved far harder. we propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. neural language embedding models can be effectively trained to map dictionary definitions (phrases) to (lexical) representations of the words defined by those definitions. we present two applications of these architectures: ""reverse dictionaries"" that return the name of a concept given a definition or description, and general-knowledge crossword question answerers. on both tasks, neural language embedding models trained on definitions from a handful of freely-available lexical resources perform as well as or better than existing commercial systems that rely on significant task-specific engineering. the results highlight the effectiveness of both neural embedding architectures and definition-based training for developing models that understand phrases and sentences.",4 "leveraging sparse and dense feature combinations for sentiment classification. neural networks are one of the most popular approaches for many natural language processing tasks such as sentiment analysis. they often outperform traditional machine learning models and achieve state-of-art results on most tasks. however, many existing deep learning models are complex, difficult to train and provide a limited improvement over simpler methods. we propose a simple, robust and powerful model for sentiment classification. this model outperforms many deep learning models and achieves results comparable to deep learning models with complex architectures on sentiment analysis datasets. 
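the dictionary-embedding entry above frames a reverse dictionary as nearest-neighbour lookup in embedding space: embed the definition, then return the word whose embedding is closest. a toy sketch with made-up two-dimensional vectors (the real model learns both embedding spaces):

```python
import numpy as np

def reverse_dictionary(def_vec, word_vecs, words):
    """return the word whose embedding has highest cosine similarity
    with the embedded definition."""
    W = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    q = def_vec / np.linalg.norm(def_vec)
    return words[int(np.argmax(W @ q))]

words = np.array(["cat", "boat", "tree"])
word_vecs = np.array([[1.0, 0.1], [0.0, 1.0], [0.7, 0.7]])
definition = np.array([0.1, 0.9])   # stand-in embedding of some definition
print(reverse_dictionary(definition, word_vecs, words))
```

the crossword application is the same lookup restricted to candidates of the right length.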
we publish our code online.",4 "synthesis of a supervised classification algorithm using intelligent and statistical tools. a fundamental task in detecting foreground objects in both static and dynamic scenes is to make the best choice of color system representation and an efficient technique for background modeling. we propose in this paper a non-parametric algorithm dedicated to segmenting and detecting objects in color images taken from a football sports meeting. indeed, segmentation per pixel concerns many applications, and our method proved robust in detecting objects even in the presence of strong shadows and highlights. on the other hand, to refine the playing strategy in football, handball, volley ball, rugby..., the coach needs the maximum of technical-tactics information on the on-going game and the players. we propose in this paper a range of algorithms allowing the resolution of many problems appearing in the automated process of team identification, where each player is assigned to his corresponding team relying on visual data. the developed system was tested on a match of the tunisian national competition. this work is prominent for many subsequent computer vision studies, as detailed in our study.",4 "a comparative analysis of methods for estimating axon diameter using dwi. the importance of studying brain microstructure is described, and the existing state of the art non-invasive methods for the investigation of brain microstructure using diffusion weighted magnetic resonance imaging (dwi) are studied. in the next step, the cramer-rao lower bound (crlb) analysis is described and utilised for the assessment of the minimum estimation error and uncertainty level of different diffusion weighted magnetic resonance (dwmr) signal decay models. these analyses are performed considering the best scenario, in which we assume the models are an appropriate representation of the measured phenomena. this includes a study of the sensitivity of the estimations to the measurement and model parameters. it is demonstrated that none of the existing models achieve a reasonable minimum uncertainty level in a typical measurement setup. in the end, the practical obstacles to achieving higher performance in clinical and experimental environments are studied, and the effects on the feasibility of the methods are discussed.",4 "a mip backend for the idp system. the idp knowledge base system currently uses minisat(id) as its backend constraint programming (cp) solver. 
similar systems have used a mixed integer programming (mip) solver as their backend. however, so far little is known about when a mip solver is preferable. this paper explores this question. it describes the use of cplex as a backend for idp and reports on experiments comparing both backends.",4 "a sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification. convolutional neural networks (cnns) have recently achieved remarkably strong performance on the practically important task of sentence classification (kim 2014, kalchbrenner 2014, johnson 2014). however, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on. it is currently unknown how sensitive model performance is to changes in these configurations for the task of sentence classification. we thus conduct a sensitivity analysis of one-layer cnns to explore the effect of architecture components on model performance; our aim is to distinguish between important and comparatively inconsequential design decisions for sentence classification. we focus on one-layer cnns (to the exclusion of more complex models) due to their comparative simplicity and strong empirical performance, which makes them a modern standard baseline method akin to support vector machines (svms) and logistic regression. we derive practical advice from our extensive empirical results for those interested in getting the most out of cnns for sentence classification in real world settings.",4 "the partner units configuration problem: completing the picture. the partner units problem (pup) is an acknowledged hard benchmark problem for the logic programming community, with various industrial application fields like surveillance, electrical engineering, computer networks and railway safety systems. however, its computational complexity has remained widely unclear so far. in this paper we provide the missing complexity results, making the pup better exploitable for benchmark testing. furthermore, we present quickpup, a heuristic search algorithm for pup instances which outperforms state-of-the-art solving approaches and is already in use in real world industrial configuration environments.",4 "opennmt: an open-source toolkit for neural machine translation. 
we introduce an open-source toolkit for neural machine translation (nmt) to support research into model architectures, feature representations, and source modalities, while maintaining competitive performance, modularity and reasonable training requirements.",4 "the gf mathematics library. this paper is devoted to presenting the mathematics grammar library, a system for multilingual mathematical text processing. we explain the context in which it originated, its current design and functionality, and the current development goals. we also present two prototype services and comment on possible future applications in the area of artificial mathematics assistants.",4 "all you need is a good init. layer-sequential unit-variance (lsuv) initialization - a simple method for weight initialization for deep net learning - is proposed. the method consists of two steps. first, pre-initialize the weights of each convolution or inner-product layer with orthonormal matrices. second, proceed from the first to the final layer, normalizing the variance of the output of each layer to be equal to one. experiments with different activation functions (maxout, relu-family, tanh) show that the proposed initialization leads to learning of very deep nets that (i) produces networks with test accuracy better or equal to standard methods and (ii) is at least as fast as the complex schemes proposed specifically for very deep nets, such as fitnets (romero et al. (2015)) and highway (srivastava et al. (2015)). performance is evaluated on googlenet, caffenet, fitnets and residual nets, and the state-of-the-art, or very close to it, is achieved on the mnist, cifar-10/100 and imagenet datasets.",4 "attentional push: augmenting salience with shared attention modeling. we present a novel visual attention tracking technique based on shared attention modeling. our proposed method models the viewer as a participant in the activity occurring in the scene. we go beyond image salience: instead of only computing the power of an image region to pull attention to it, we also consider the strength with which other regions of the image push attention to the region in question. we use the term attentional push to refer to the power of image regions to direct and manipulate the attention allocation of the viewer. an attention model is presented which incorporates attentional push cues with standard image salience-based attention modeling algorithms to improve the ability to predict where viewers will fixate. 
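the lsuv entry above consists of two steps: orthonormal pre-initialization, then per-layer rescaling until each layer's output variance is one. a numpy sketch for a stack of plain linear layers (non-linearities and convolutions omitted; layer sizes are assumed non-increasing so the reduced qr step yields the right shape):

```python
import numpy as np

def lsuv_init(layer_sizes, X, tol=0.01, max_iter=20, seed=0):
    """orthonormal init + variance normalization, layer by layer."""
    rng = np.random.default_rng(seed)
    weights, h = [], X
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # step 1: orthonormal initialization via qr decomposition
        W = np.linalg.qr(rng.normal(size=(n_in, n_out)))[0]
        # step 2: rescale until the layer's output variance is close to one
        for _ in range(max_iter):
            var = (h @ W).var()
            if abs(var - 1.0) < tol:
                break
            W /= np.sqrt(var)
        weights.append(W)
        h = h @ W
    return weights, h

X = np.random.default_rng(1).normal(size=(256, 32))
weights, h = lsuv_init([32, 32, 16, 10], X)
print(round(float(h.var()), 2))
```

for linear layers one rescaling already lands on unit variance; with non-linearities the loop genuinely iterates, which is why lsuv uses a tolerance and an iteration cap.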
experimental evaluation validates significant improvements in predicting viewers' fixations using the proposed methodology, in both static and dynamic imagery.",4 "given a fixed budget of dwell time, how should it be spent in scanning electron microscopy to optimize image quality?. in scanning electron microscopy, the achievable image quality is often limited by a maximum feasible acquisition time per dataset. particularly with regard to three-dimensional or large field-of-view imaging, a compromise must be found between a high amount of shot noise, which leads to a low signal-to-noise ratio, and excessive acquisition times. assuming a fixed acquisition time per frame, we compared three different strategies for algorithm-assisted image acquisition in scanning electron microscopy. we evaluated (1) raster scanning with a reduced dwell time per pixel followed by a state-of-the-art denoising algorithm, (2) raster scanning with a decreased resolution in conjunction with a state-of-the-art super resolution algorithm, and (3) a sparse scanning approach where a fixed percentage of pixels is visited by the beam, in combination with state-of-the-art inpainting algorithms. additionally, we considered increased beam currents for each of the strategies. the experiments showed that sparse scanning using an appropriate reconstruction technique was superior to the other strategies.",4 "phase and tv based convex sets for blind deconvolution of microscopic images. in this article, two closed and convex sets for the blind deconvolution problem are proposed. most blurring functions in microscopy are symmetric with respect to the origin; therefore, they do not modify the phase of the fourier transform (ft) of the original image. as a result, the blurred image and the original image have the same ft phase. therefore, the set of images with a prescribed ft phase can be used as a constraint set in blind deconvolution problems. another convex set that can be used in the image reconstruction process is the epigraph set of the total variation (tv) function. this set does not need a prescribed upper bound on the total variation of the image: the upper bound is automatically adjusted according to the current image of the restoration process. both closed and convex sets can be used as part of any blind deconvolution algorithm. simulation examples are presented.",12 "learning a peptide-protein binding affinity predictor with kernel ridge regression. 
we propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. the kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the oligo, weighted degree, blended spectrum, and radial basis function. we provide a low complexity dynamic programming algorithm for the exact computation of the kernel and a linear time algorithm for its approximation. combined with kernel ridge regression and supck, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the pepx database. for the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. the method was also applied to both single-target and pan-specific major histocompatibility complex class ii benchmark datasets and three quantitative structure affinity model benchmark datasets. on all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. the proposed approach is flexible and can be applied to predict any quantitative biological activity. the method should be of value to a large segment of the research community, with the potential to accelerate peptide-based drug and vaccine development.",16 "wide-residual-inception networks for real-time object detection. since convolutional neural network (cnn) models emerged, several tasks in computer vision have actively deployed cnn models for feature extraction. however, the conventional cnn models have a high computational cost and require high memory capacity, which is impractical and unaffordable for commercial applications such as real-time on-road object detection on embedded boards or mobile platforms. to tackle this limitation of cnn models, this paper proposes a wide-residual-inception (wr-inception) network, which constructs an architecture based on a residual inception unit that captures objects of various sizes on the same feature map, with shallower and wider layers, compared to state-of-the-art networks like resnet. 
verify proposed networks, paper conducted two experiments; one classification task cifar-10/100 on-road object detection task using single-shot multi-box detector (ssd) kitti dataset.",4 "augmented artificial intelligence: conceptual framework. artificial intelligence (ai) systems make errors. errors unexpected, differ often typical human mistakes (""non-human"" errors). ai errors corrected without damage existing skills and, hopefully, avoiding direct human expertise. paper presents initial summary report project taking new systematic approach improving intellectual effectiveness individual ai communities ais. combine ideas learning heterogeneous multiagent systems new original mathematical approaches non-iterative corrections errors legacy ai systems. new stochastic separation theorems demonstrate corrector technology used handle errors data flows general probability distributions far away classical i.i.d. hypothesis. in particular, analysis mathematical foundations ai non-destructive correction, answer one general problem published donoho tanner 2009.",4 "automatic photo adjustment using deep neural networks. photo retouching enables photographers invoke dramatic visual impressions artistically enhancing photos stylistic color tone adjustments. however, also time-consuming challenging task requires advanced skills beyond abilities casual photographers. using automated algorithm appealing alternative manual work algorithm faces many hurdles. many photographic styles rely subtle adjustments depend image content even semantics. further, adjustments often spatially varying. characteristics, existing automatic algorithms still limited cover subset challenges. recently, deep machine learning shown unique abilities address hard problems resisted machine algorithms long. motivated us explore use deep learning context photo editing. paper, explain formulate automatic photo adjustment problem way suitable approach.
also introduce image descriptor accounts local semantics image. experiments demonstrate deep learning formulation applied using descriptors successfully capture sophisticated photographic styles. particular unlike previous techniques, model local adjustments depend image semantics. show several examples yields results qualitatively quantitatively better previous work.",4 "process-oriented iterative multiple alignment medical process mining. adapted biological sequence alignment, trace alignment process mining technique used visualize analyze workflow data. analysis done method, however, affected alignment quality. best existing trace alignment techniques use progressive guide-trees heuristically approximate optimal alignment o(n^2 l^2) time. algorithms heavily dependent selected guide-tree metric, often return sum-of-pairs-score-reducing errors interfere interpretation, computationally intensive large datasets. alleviate issues, propose process-oriented iterative multiple alignment (pima), contains specialized optimizations better handle workflow data. demonstrate pima flexible framework capable achieving better sum-of-pairs score existing trace alignment algorithms o(n l^2) time. applied pima analyzing medical workflow data, showing iterative alignment better represent data facilitate extraction insights data visualization.",4 "multi-domain neural network language generation spoken dialogue systems. moving limited-domain natural language generation (nlg) open domain difficult number semantic input combinations grows exponentially number domains. therefore, important leverage existing resources exploit similarities domains facilitate domain adaptation. paper, propose procedure train multi-domain, recurrent neural network-based (rnn) language generators via multiple adaptation steps. procedure, model first trained counterfeited data synthesised out-of-domain dataset, fine tuned small set in-domain utterances discriminative objective function.
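Trace alignment as described in the process-mining abstract above descends from biological sequence alignment; a minimal Needleman-Wunsch global aligner (toy unit scores, not PIMA's workflow-specific optimizations) shows the underlying dynamic program:

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score via dynamic programming, O(len(a)*len(b)) time."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap                     # align prefix of a against gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap                     # align prefix of b against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # substitute / match
                           dp[i - 1][j] + gap,     # gap in b
                           dp[i][j - 1] + gap)     # gap in a
    return dp[n][m]

# Two hypothetical activity traces; one skips the "triage-" step.
score = needleman_wunsch("register-triage-exam", "register-exam")
```

The quadratic per-pair cost here is the `l^2` factor inside the quoted `o(n^2 l^2)` and `o(n l^2)` bounds; multiple alignment then decides how many such pairwise passes are made.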
corpus-based evaluation results show proposed procedure achieve competitive performance terms bleu score slot error rate significantly reducing data needed train generators new, unseen domains. subjective testing, human judges confirm procedure greatly improves generator performance small amount data available domain.",4 "handwritten digit recognition bio-inspired hierarchical networks. human brain processes information showing learning prediction abilities underlying neuronal mechanisms still remain unknown. recently, many studies prove neuronal networks able generalizations associations sensory inputs. paper, following set neurophysiological evidences, propose learning framework strong biological plausibility mimics prominent functions cortical circuitries. developed inductive conceptual network (icn), hierarchical bio-inspired network, able learn invariant patterns variable-order markov models implemented nodes. outputs top-most node icn hierarchy, representing highest input generalization, allow automatic classification inputs. found icn clusterized mnist images error 5.73% usps images error 12.56%.",4 "differential methods catadioptric sensor design applications panoramic imaging. discuss design techniques catadioptric sensors realize given projections. general, problems solutions, approximate solutions may often found visually acceptable. several methods approach problem, focus call ``vector field approach''. application given true panoramic mirror derived, i.e. mirror yields cylindrical projection viewer without digital unwarping.",4 "survey naïve bayes machine learning approach text document classification. text document classification aims associating one predefined categories based likelihood suggested training set labeled documents. many machine learning algorithms play vital role training system predefined categories among naïve bayes intriguing facts simple, easy implement draws better accuracy large datasets spite naïve dependence.
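A multinomial naïve Bayes text classifier of the kind surveyed above fits in a few lines; this sketch uses Laplace smoothing and a tiny invented spam/ham corpus (all data and labels are illustrative):

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial naive Bayes: class priors plus per-class word counts."""
    classes = set(labels)
    vocab = {w for d in docs for w in d.split()}
    prior = {c: labels.count(c) / len(labels) for c in classes}
    counts = {c: Counter() for c in classes}
    for d, c in zip(docs, labels):
        counts[c].update(d.split())
    return prior, counts, vocab

def predict_nb(text, prior, counts, vocab):
    best, best_lp = None, -math.inf
    for c in prior:
        total = sum(counts[c].values())
        lp = math.log(prior[c])
        for w in text.split():
            if w in vocab:  # ignore words never seen in training
                lp += math.log((counts[c][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

docs = ["cheap pills buy now", "meeting agenda attached",
        "buy cheap watches", "project meeting notes"]
labels = ["spam", "ham", "spam", "ham"]
model = train_nb(docs, labels)
pred = predict_nb("buy cheap pills", *model)
```

The "naïve" conditional-independence assumption the survey mentions is exactly the per-word sum of log-probabilities in `predict_nb`.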
importance naïve bayes machine learning approach felt hence study taken text document classification statistical event models available. survey various feature selection methods discussed compared along metrics related text document classification.",4 "semi-blind sparse image reconstruction application mrfm. propose solution image deconvolution problem convolution kernel point spread function (psf) assumed partially known. small perturbations generated model exploited produce principal components explaining psf uncertainty high dimensional space. unlike recent developments blind deconvolution natural images, assume image sparse pixel basis, natural sparsity arising magnetic resonance force microscopy (mrfm). approach adopts bayesian metropolis-within-gibbs sampling framework. performance bayesian semi-blind algorithm sparse images superior previously proposed semi-blind algorithms alternating minimization (am) algorithm blind algorithms developed natural images. illustrate myopic algorithm real mrfm tobacco virus data.",15 "inferring taxi status using gps trajectories. paper, infer statuses taxi, consisting occupied, non-occupied parked, terms gps trajectory. status information enable urban computing improving city's transportation systems land use planning. solution, first identify extract set effective features incorporating knowledge single trajectory, historical trajectories geographic data like road network. second, parking status detection algorithm devised find parking places (from given trajectory), dividing trajectory segments (i.e., sub-trajectories). third, propose two-phase inference model learn status (occupied non-occupied) point taxi segment. model first uses identified features train local probabilistic classifier carries hidden semi-markov model (hsmm) globally considering long term travel patterns.
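The global decoding step of the taxi-status model above can be pictured with standard Viterbi decoding for a plain HMM (the paper uses a hidden semi-Markov model, which additionally models state durations; states, observations, and probabilities below are invented for illustration):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete HMM, in log-space."""
    T, N = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = logd[:, None] + np.log(A)   # cand[i, j]: best path ending i -> j
        back[t] = cand.argmax(axis=0)
        logd = cand.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):          # follow backpointers
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# states: 0 = occupied, 1 = non-occupied; observations: 0 = fast, 1 = slow
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])     # sticky transitions
B = np.array([[0.9, 0.1], [0.2, 0.8]])     # occupied taxis tend to move fast
path = viterbi([0, 0, 1, 1, 1], pi, A, B)
```

The sticky transition matrix plays the role of the "long term travel patterns" prior: isolated slow points do not flip the state, but a sustained run of them does.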
evaluated method large-scale real-world trajectory dataset generated 600 taxis, showing advantages method baselines.",4 "camera identification grouping images database, based shared noise patterns. previous research showed camera specific noise patterns, so-called prnu-patterns, extracted images related images could found. particular research focus grouping images database, based shared noise pattern identification method cameras. using method described article, groups images, created using camera, could linked large database images. using matlab programming, relevant image noise patterns extracted images much quicker common methods use faster noise extraction filters improvements reduce calculation costs. relating noise patterns, correlation certain threshold value, quickly matched. hereby, database images, groups relating images could linked method could used scan large number images suspect noise patterns.",4 "aba+: assumption-based argumentation preferences. present aba+, new approach handling preferences well known structured argumentation formalism, assumption-based argumentation (aba). aba+, preference information given assumptions incorporated directly attack relation, thus resulting attack reversal. aba+ conservatively extends aba exhibits various desirable features regarding relationship among argumentation semantics well preference handling. also introduce weak contraposition, principle concerning reasoning rules preferences relaxes standard principle contraposition, guaranteeing additional desirable features aba+.",4 "identifying unknown unknowns open world: representations policies guided exploration. predictive models deployed real world may assign incorrect labels instances high confidence. errors unknown unknowns rooted model incompleteness, typically arise mismatch training data cases encountered test time. models blind errors, input oracle needed identify failures. 
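The PRNU-based grouping described in the camera-identification abstract above reduces to extracting noise residuals and correlating them; a toy NumPy sketch with a box-filter residual and normalized cross-correlation (production systems use wavelet-based denoising filters and peak-to-correlation-energy statistics, all values here are synthetic):

```python
import numpy as np

def noise_residual(img, k=3):
    """Crude noise residual: image minus a k x k box-filtered version."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    smooth = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            smooth += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return img - smooth / (k * k)

def ncc(a, b):
    """Normalized cross-correlation between two flattened residuals."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(1)
prnu_cam = rng.standard_normal((32, 32))       # fixed sensor pattern
scene1, scene2 = rng.standard_normal((2, 32, 32))
img_a = scene1 + 0.5 * prnu_cam                # two images, same camera
img_b = scene2 + 0.5 * prnu_cam
img_c = rng.standard_normal((32, 32))          # image from another camera

r = [noise_residual(x) for x in (img_a, img_b, img_c)]
same_cam = ncc(r[0], r[1])
diff_cam = ncc(r[0], r[2])
```

Thresholding `ncc` values is what lets groups of images sharing one sensor pattern be linked inside a large database.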
paper, formulate address problem informed discovery unknown unknowns given predictive model unknown unknowns occur due systematic biases training data. propose model-agnostic methodology uses feedback oracle identify unknown unknowns intelligently guide discovery. employ two-phase approach first organizes data multiple partitions based feature similarity instances confidence scores assigned predictive model, utilizes explore-exploit strategy discovering unknown unknowns across partitions. demonstrate efficacy framework varying underlying causes unknown unknowns across various applications. best knowledge, paper presents first algorithmic approach problem discovering unknown unknowns predictive models.",4 "difficulty selecting ising models approximate recovery. paper, consider problem estimating underlying graph associated ising model given number independent identically distributed samples. adopt \emph{approximate recovery} criterion allows number missed edges incorrectly-included edges, contrast widely-studied exact recovery problem. main results provide information-theoretic lower bounds sample complexity graph classes imposing constraints number edges, maximal degree, properties. identify broad range scenarios where, either constant factors logarithmic factors, lower bounds match best known lower bounds exact recovery criterion, several known tight near-tight. hence, cases, approximate recovery similar difficulty exact recovery minimax sense. bounds obtained via modification fano's inequality handling approximate recovery criterion, along suitably-designed ensembles graphs broadly classed two categories: (i) containing graphs contain several isolated edges cliques thus difficult distinguish empty graph; (ii) containing graphs certain groups nodes highly correlated, thus making difficult determine precisely edges connect them. support theoretical results ensembles numerical experiments.",4 "creative robot dance variational encoder. 
appreciate dance ability people spontaneously improvise new movements choreographies, surrendering music rhythm, inspired current perceptions sensations previous experiences, deeply stored memory. like human abilities, this, course, challenging reproduce artificial entity robot. recent generations anthropomorphic robots, so-called humanoids, however, exhibit sophisticated skills raised interest robotic communities design experiment systems devoted automatic dance generation. work, highlight importance model computational creativity behavior dancing robots avoid mere execution preprogrammed dances. particular, exploit deep learning approach allows robot generate real time new dancing movements according listened music.",4 "translation ""zur ermittlung eines objektes aus zwei perspektiven mit innerer orientierung"" erwin kruppa (1913). erwin kruppa's 1913 paper, erwin kruppa, ""zur ermittlung eines objektes aus zwei perspektiven mit innerer orientierung"", sitzungsberichte der mathematisch-naturwissenschaftlichen kaiserlichen akademie der wissenschaften, vol. 122 (1913), pp. 1939-1948, may translated ""to determine 3d object two perspective views known inner orientation"", landmark paper computer vision provides first five-point algorithm relative pose estimation. kruppa showed (a finite number solutions for) relative pose two calibrated images rigid object computed five point matches images. kruppa's work also gained attention topic camera self-calibration, presented (maybank faugeras, 1992). since paper still relevant today (more hundred citations within last ten years) paper available online, ordered copy german national library frankfurt provide english translation along german original. also adapt terminology modern jargon provide clarifications (highlighted sans-serif font). historical review geometric computer vision, reader referred recent survey paper (sturm, 2011).
paper proposes cf-nade, neural autoregressive architecture collaborative filtering (cf) tasks, inspired restricted boltzmann machine (rbm) based cf model neural autoregressive distribution estimator (nade). first describe basic cf-nade model cf tasks. propose improve model sharing parameters different ratings. factored version cf-nade also proposed better scalability. furthermore, take ordinal nature preferences consideration propose ordinal cost optimize cf-nade, shows superior performance. finally, cf-nade extended deep model, moderately increased computational complexity. experimental results show cf-nade single hidden layer beats previous state-of-the-art methods movielens 1m, movielens 10m, netflix datasets, adding hidden layers improve performance.",4 "depth-width tradeoffs approximating natural functions neural networks. provide several new depth-based separation results feed-forward neural networks, proving various types simple natural functions better approximated using deeper networks shallower ones, even shallower networks much larger. includes indicators balls ellipses; non-linear functions radial respect $l_1$ norm; smooth non-linear functions. also show gaps observed experimentally: increasing depth indeed allows better learning increasing width, training neural networks learn indicator unit ball.",4 "skeleton-based action recognition convolutional neural networks. current state-of-the-art approaches skeleton-based action recognition mostly based recurrent neural networks (rnn). paper, propose novel convolutional neural networks (cnn) based framework action classification detection. raw skeleton coordinates well skeleton motion fed directly cnn label prediction. novel skeleton transformer module designed rearrange select important skeleton joints automatically. simple 7-layer network, obtain 89.3% accuracy validation set ntu rgb+d dataset. 
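Feeding raw skeleton coordinates to convolutions, as in the action-recognition abstract above, can be pictured with a naive 2-D convolution over a (frames x joints) array; the temporal-difference kernel and toy clip below are purely illustrative, not the paper's 7-layer network:

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive 'valid' 2-D cross-correlation used as a feature extractor."""
    H, W = x.shape
    h, w = k.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + h, j:j + w] * k).sum()
    return out

# Toy skeleton clip: 8 frames x 6 joints, one coordinate channel.
rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 6))
temporal_edge = np.array([[1.0], [-1.0]])   # responds to frame-to-frame motion
feat = conv2d_valid(clip, temporal_edge)
```

This is the sense in which "skeleton motion fed directly cnn" works: a learned stack of such kernels replaces the hand-picked difference filter.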
action detection untrimmed videos, develop window proposal network extract temporal segment proposals, classified within network. recent pku-mmd dataset, achieve 93.7% map, surpassing baseline large margin.",4 "end-to-end audiovisual speech recognition. several end-to-end deep learning approaches recently presented extract either audio visual features input images audio signals perform speech recognition. however, research end-to-end audiovisual models limited. work, present end-to-end audiovisual model based residual networks bidirectional gated recurrent units (bgrus). best knowledge, first audiovisual fusion model simultaneously learns extract features directly image pixels audio waveforms performs within-context word recognition large publicly available dataset (lrw). model consists two streams, one modality, extract features directly mouth regions raw waveforms. temporal dynamics stream/modality modeled 2-layer bgru fusion multiple streams/modalities takes place via another 2-layer bgru. slight improvement classification rate end-to-end audio-only mfcc-based model reported clean audio conditions low levels noise. presence high levels noise, end-to-end audiovisual model significantly outperforms audio-only models.",4 "maximum principle based algorithms deep learning. continuous dynamical system approach deep learning explored order devise alternative frameworks training algorithms. training recast control problem allows us formulate necessary optimality conditions continuous time using pontryagin's maximum principle (pmp). modification method successive approximations used solve pmp, giving rise alternative training algorithm deep learning. approach advantage rigorous error estimates convergence results established. also show may avoid pitfalls gradient-based methods, slow convergence flat landscapes near saddle points. 
furthermore, demonstrate obtains favorable initial convergence rate per-iteration, provided hamiltonian maximization efficiently carried - step still need improvement. overall, approach opens new avenues attack problems associated deep learning, trapping slow manifolds inapplicability gradient-based methods discrete trainable variables.",4 "kinship verification videos using spatio-temporal texture features deep learning. automatic kinship verification using facial images relatively new challenging research problem computer vision. consists automatically predicting whether two persons biological kin relation examining facial attributes. existing works extract shallow handcrafted features still face images, approach problem spatio-temporal point view explore use shallow texture features deep features characterizing faces. promising results, especially deep features, obtained benchmark uva-nemo smile database. extensive experiments also show superiority using videos still images, hence pointing important role facial dynamics kinship verification. furthermore, fusion two types features (i.e. shallow spatio-temporal texture features deep features) shows significant performance improvements compared state-of-the-art methods.",4 "prediction sea surface temperature using long short-term memory. letter adopts long short-term memory (lstm) predict sea surface temperature (sst), first attempt, knowledge, use recurrent neural network solve problem sst prediction, make one week one month daily prediction. formulate sst prediction problem time series regression problem. lstm special kind recurrent neural network, introduces gate mechanism vanilla rnn prevent vanishing exploding gradient problem. strong ability model temporal relationship time series data handle long-term dependency problem well. proposed network architecture composed two kinds layers: lstm layer full-connected dense layer. lstm layer utilized model time series relationship.
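The gate mechanism that lets an LSTM avoid vanishing/exploding gradients, mentioned above, can be sketched as a single NumPy cell step (random weights and toy dimensions; this is the generic cell, not the letter's trained SST architecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; all four gates come from a single fused projection."""
    z = W @ x + U @ h + b                  # shape (4 * hidden,)
    n = h.shape[0]
    i = sigmoid(z[0:n])                    # input gate
    f = sigmoid(z[n:2 * n])                # forget gate
    o = sigmoid(z[2 * n:3 * n])            # output gate
    g = np.tanh(z[3 * n:4 * n])            # candidate cell state
    c_new = f * c + i * g                  # additive cell update (the key trick)
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 3, 5                                # input and hidden sizes
W = rng.standard_normal((4 * H, D)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h = c = np.zeros(H)
for t in range(7):                         # unroll over a toy SST-like sequence
    x = rng.standard_normal(D)
    h, c = lstm_step(x, h, c, W, U, b)
```

The final hidden state `h` is what a full-connected dense layer would then map to the scalar temperature prediction.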
full-connected layer utilized map output lstm layer final prediction. explore optimal setting architecture experiments report accuracy coastal seas china confirm effectiveness proposed method. addition, also show online updated characteristics.",4 "recommending agenda: active learning private attributes using matrix factorization. recommender systems leverage user demographic information, age, gender, etc., personalize recommendations better place targeted ads. oftentimes, users volunteer information due privacy concerns, due lack initiative filling online profiles. illustrate new threat recommender learns private attributes users voluntarily disclose them. design passive active attacks solicit ratings strategically selected items, could thus used recommender system pursue hidden agenda. methods based novel usage bayesian matrix factorization active learning setting. evaluations multiple datasets illustrate attacks indeed feasible use significantly fewer rated items static inference methods. importantly, succeed without sacrificing quality recommendations users.",4 "discriminative neural sentence modeling tree-based convolution. paper proposes tree-based convolutional neural network (tbcnn) discriminative sentence modeling. models leverage either constituency trees dependency trees sentences. tree-based convolution process extracts sentences' structural features, features aggregated max pooling. architecture allows short propagation paths output layer underlying feature detectors, enables effective structural feature learning extraction. evaluate models two tasks: sentiment analysis question classification. experiments, tbcnn outperforms previous state-of-the-art results, including existing neural networks dedicated feature/rule engineering. also make efforts visualize tree-based convolution process, shedding light models work.",4 "revealing autonomous system taxonomy: machine learning approach. 
although internet as-level topology extensively studied past years, little known details taxonomy. ""node"" represent wide variety organizations, e.g., large isp, small private business, university, vastly different network characteristics, external connectivity patterns, network growth tendencies, properties hardly neglect working veracious internet representations simulation environments. paper, introduce radically new approach based machine learning techniques map ases internet natural taxonomy. successfully classify 95.3% ases expected accuracy 78.1%. release community as-level topology dataset augmented with: 1) taxonomy information 2) set attributes used classify ases. believe dataset serve invaluable addition understanding structure evolution internet.",4 "general framework interacting bayes-optimally self-interested agents using arbitrary parametric model model prior. recent advances bayesian reinforcement learning (brl) shown bayes-optimality theoretically achievable modeling environment's latent dynamics using flat-dirichlet-multinomial (fdm) prior. self-interested multi-agent environments, transition dynamics mainly controlled agent's stochastic behavior fdm's independence modeling assumptions hold. result, fdm allow agent's behavior generalized across different states specified using prior domain knowledge. overcome practical limitations fdm, propose generalization brl integrate general class parametric models model priors, thus allowing practitioners' domain knowledge exploited produce fine-grained compact representation agent's behavior. empirical evaluation shows approach outperforms existing multi-agent reinforcement learning algorithms.",4 "new hybrid metric verifying parallel corpora arabic-english. paper discusses new metric applied verify quality translation sentence pairs parallel corpora arabic-english. metric combines two techniques, one based sentence length based compression code length. 
experiments sample test parallel arabic-english corpora indicate combination two techniques improves accuracy identification satisfactory unsatisfactory sentence pairs compared sentence length compression code length alone. new method proposed research effective filtering noise reducing mis-translations resulting greatly improved quality.",4 "asp minimal entailment rational extension sroel. paper exploit answer set programming (asp) reasoning rational extension sroel-r-t low complexity description logic sroel, underlies owl el ontology language. extended language, typicality operator allowed define concepts t(c) (typical c's) rational semantics. proven instance checking rational entailment polynomial complexity. strengthen rational entailment, paper consider minimal model semantics. show that, arbitrary sroel-r-t knowledge bases, instance checking minimal entailment \pi^p_2-complete. relying small model result, models correspond answer sets suitable asp encoding, exploit answer set preferences (and, particular, asprin framework) reasoning minimal entailment. paper consideration acceptance theory practice logic programming.",4 "using deep learning reveal neural code images primary visual cortex. primary visual cortex (v1) first stage cortical image processing, major effort systems neuroscience devoted understanding encodes information visual stimuli. within v1, many neurons respond selectively edges given preferred orientation: known simple complex cells, well-studied. neurons respond localized center-surround image features. still others respond selectively certain image stimuli, specific features excite unknown. moreover, even simple complex cells-- best-understood v1 neurons-- challenging predict respond natural image stimuli. thus, important gaps understanding v1 encodes images. fill gap, train deep convolutional neural networks predict firing rates v1 neurons response natural image stimuli, find 15% neurons within 10% theoretical limit predictability. 
well predicted neurons, invert predictor network identify image features (receptive fields) cause v1 neurons spike. addition previously-characterized receptive fields (gabor wavelet center-surround), identify neurons respond predictably higher-level textural image features localized particular region image.",16 "negative results computer vision: perspective. negative result outcome experiment model expected hypothesis hold. despite often overlooked scientific community, negative results results carry value. topic extensively discussed fields social sciences biosciences, less attention paid computer vision community. unique characteristics computer vision, particularly experimental aspect, call special treatment matter. paper, address makes negative results important, disseminated incentivized, lessons learned cognitive vision research regard. further, discuss issues computer vision human vision interaction, experimental design statistical hypothesis testing, explanatory versus predictive modeling, performance evaluation, model comparison, well computer vision research culture.",4 "group event detection varying number group members video surveillance. paper presents novel approach automatic recognition group activities video surveillance applications. propose use group representative handle recognition varying number group members, use asynchronous hidden markov model (ahmm) model relationship people. furthermore, propose group activity detection algorithm handle symmetric asymmetric group activities, demonstrate approach enables detection hierarchical interactions people. experimental results show effectiveness approach.",4 "retinal vasculature segmentation using local saliency maps generative adversarial networks image super resolution. propose image super resolution (isr) method using generative adversarial networks (gans) takes low resolution input fundus image generates high resolution super resolved (sr) image up to scaling factor $16$.
facilitates accurate automated image analysis, especially small blurred landmarks pathologies. local saliency maps, define pixel's importance, used define novel saliency loss gan cost function. experimental results show resulting sr images perceptual quality close original images perform better competing methods weigh pixels according importance. used retinal vasculature segmentation, sr images result accuracy levels close obtained using original images.",4 "approximation two-part mdl code. approximation optimal two-part mdl code given data, successive monotonically length-decreasing two-part mdl codes, following properties: (i) computation step may take arbitrarily long; (ii) may know reach optimum, whether reach optimum all; (iii) sequence models generated may monotonically improve goodness fit; (iv) model associated optimum (almost) best goodness fit. express practically interesting goodness fit individual models individual data sets rely kolmogorov complexity.",4 "deductive analogical reasoning semantically embedded knowledge graph. representing knowledge high-dimensional vectors continuous semantic vector space help overcome brittleness incompleteness traditional knowledge bases. present method performing deductive reasoning directly vector space, combining analogy, association, deduction straightforward way step chain reasoning, drawing knowledge diverse sources ontologies.",4 "framework compiling preferences logic programs. introduce methodology framework expressing general preference information logic programming answer set semantics. ordered logic program extended logic program rules named unique terms, preferences among rules given set atoms form < names. ordered logic program transformed second, regular, extended logic program wherein preferences respected, answer sets obtained transformed program correspond preferred answer sets original program. approach allows specification dynamic orderings, preferences appear arbitrarily within program. 
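The saliency-loss idea from the retinal ISR abstract above, i.e. weighting per-pixel reconstruction error by pixel importance, can be illustrated directly; the saliency map and error patterns below are synthetic toys, not the paper's learned maps:

```python
import numpy as np

def saliency_weighted_mse(pred, target, saliency):
    """MSE where each pixel's squared error is scaled by normalized saliency."""
    w = saliency / (saliency.sum() + 1e-12)
    return float((w * (pred - target) ** 2).sum())

rng = np.random.default_rng(0)
target = rng.random((16, 16))
saliency = np.zeros((16, 16))
saliency[4:8, 4:8] = 1.0                    # pretend a vessel region matters most

# Two predictions with identical plain MSE, but errors in different places.
err = np.zeros((16, 16))
err[4:8, 4:8] = 0.2                         # mistake inside the salient region
pred_bad = target + err
pred_ok = target + np.roll(err, 8, axis=0)  # same mistake, non-salient region

loss_bad = saliency_weighted_mse(pred_bad, target, saliency)
loss_ok = saliency_weighted_mse(pred_ok, target, saliency)
```

An unweighted MSE cannot tell `pred_bad` from `pred_ok`; the saliency weighting penalizes only the error that lands on important pixels, which is the stated advantage over competing methods.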
static orderings (in preferences external logic program) trivial restriction general dynamic case. first, develop specific approach reasoning preferences, wherein preference ordering specifies order rules applied. demonstrate wide range applicability framework showing approaches, among brewka eiter, captured within framework. since result transformations extended logic program, make use existing implementations, dlv smodels. end, developed publicly available compiler front-end programming systems.",4 "generative model volume rendering. present technique synthesize analyze volume-rendered images using generative models. use generative adversarial network (gan) framework compute model large collection volume renderings, conditioned (1) viewpoint (2) transfer functions opacity color. approach facilitates tasks volume analysis challenging achieve using existing rendering techniques ray casting texture-based methods. show guide user transfer function editing quantifying expected change output image. additionally, generative model transforms transfer functions view-invariant latent space specifically designed synthesize volume-rendered images. use space directly rendering, enabling user explore space volume-rendered images. model independent choice volume rendering process, show analyze volume-rendered images produced direct global illumination lighting, variety volume datasets.",4 "speech enhancement using pitch detection approach noisy environment. acoustical mismatch among training testing phases degrades outstandingly speech recognition results. problem limited development real-world nonspecific applications, testing conditions highly variant even unpredictable training process. therefore background noise removed noisy speech signal increase signal intelligibility reduce listener fatigue. enhancement techniques applied, pre-processing stages; systems remarkably improve recognition results. 
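For context on the enhancement pre-processing mentioned in the speech abstract above, here is a classic spectral-subtraction baseline in NumPy (not the paper's pitch-detection approach; sampling rate, tone frequency, and noise level are toy choices):

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, frame=256):
    """Subtract a noise magnitude estimate from each frame's spectrum,
    keeping the noisy signal's phases (classic spectral subtraction)."""
    noise_mag = np.abs(np.fft.rfft(noise_est[:frame]))
    out = np.zeros_like(noisy)
    for start in range(0, len(noisy) - frame + 1, frame):
        seg = noisy[start:start + frame]
        spec = np.fft.rfft(seg)
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)   # floor at zero
        out[start:start + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(spec)))
    return out

rng = np.random.default_rng(0)
t = np.arange(2048) / 8000.0
clean = np.sin(2 * np.pi * 437.5 * t)        # toy "speech": a bin-aligned tone
noise = 0.3 * rng.standard_normal(2048)
enhanced = spectral_subtraction(clean + noise, noise)

def snr(sig, ref):
    err = sig - ref
    return 10 * np.log10((ref ** 2).sum() / (err ** 2).sum())

gain = snr(enhanced, clean) - snr(clean + noise, clean)
```

Real systems add overlap-add windowing and an oversubtraction floor to suppress musical noise; the point here is only that magnitude-domain subtraction raises SNR before recognition.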
paper, novel approach used enhance perceived quality speech signal additive noise cannot directly controlled. instead controlling background noise, propose reinforce speech signal heard clearly noisy environments. subjective evaluation shows proposed method improves perceptual quality speech various noisy environments. cases speaking may convenient typing, even rapid typists: many mathematical symbols missing keyboard easily spoken recognized. therefore, proposed system used application designed mathematical symbol recognition (especially symbols available keyboard) schools.",4 "dynamic nonlocal language modeling via hierarchical topic-based adaptation. paper presents novel method generating applying hierarchical, dynamic topic-based language models. proposes evaluates new cluster generation, hierarchical smoothing adaptive topic-probability estimation techniques. combined models help capture long-distance lexical dependencies. experiments broadcast news corpus show significant improvement perplexity (10.5% overall 33.5% target vocabulary).",4 "specifying non-markovian rewards mdps using ldl finite traces (preliminary version). markov decision processes (mdps), reward obtained state depends properties last state action. state dependency makes difficult reward interesting long-term behaviors, always closing door opened, providing coffee following request. extending mdps handle non-markovian reward function subject two previous lines work, using variants ltl specify reward function compiling new model back markovian model. building upon recent progress theories temporal logics finite traces, adopt ldlf specifying non-markovian rewards provide elegant automata construction building markovian model, extends previous work offers strong minimality compositionality guarantees.",4 "theory unified relativity biovielectroluminescence phenomenon via fly's visual imaging system. 
elucidation upon fly's neuronal patterns link computer graphics memory cards i/o's, investigated phenomenon propounding unified theory einstein's two known relativities. conclusive flies could contribute certain amount neuromatrices indicating imagery function visual-computational system computer graphics storage systems. visual system involves time aspect, whereas flies possess faster pulses compared humans' visual ability due e-field state active fly's eye surface. behaviour tested dissected fly specimen ommatidia. electro-optical contacts electrodes wired flesh forming organic emitter layer stimulate light emission, thereby computer circuit. next step applying threshold voltage secondary voltages circuit denoting array essential electrodes bit switch. result, circuit's dormant pulses versus active pulses specimen's area recorded. outcome matrix possesses construction rgb time radicals expressing time problem consumption, allocating time computational algorithms, enhancing technology far beyond. obtained formulation generates consumed distance cons(x), denoting circuital travel data source/sink pixel data bendable wavelengths. 'image logic' place, incorporating point graphical acceleration permits one enhance graphics optimize immensely central processing, data transmissions memory computer visual system. phenomenon mainly used 360-deg. display/viewing, 3d scanning techniques, military medicine, robust cheap substitution e.g. pre-motion pattern analysis, real-time rendering lcds.",4 "highly automated learning improved active safety vulnerable road users. highly automated driving requires precise models traffic participants. many state art models currently based machine learning techniques. among others, required amount labeled data one major challenge. autonomous learning process addressing problem proposed. 
initial models iteratively refined three steps: (1) detection context identification, (2) novelty detection active learning (3) online model adaption.",4 "novel strategy selection method multi-objective clustering algorithms using game theory. important factors contribute efficiency game-theoretical algorithms time game complexity. study, offered elegant method deal high complexity game theoretic multi-objective clustering methods large-sized data sets. here, developed method selects subset strategies strategies profile player. case, size payoff matrices reduces significantly remarkable impact time complexity. therefore, practical problems data tractable less computational complexity. although strategies set may grow increasing number data points, presented model strategy selection reduces strategy space, considerably, clusters subdivided several sub-clusters local game. remarkable results demonstrate efficiency presented approach reducing computational complexity problem concern.",4 "group theory, group actions, evolutionary algorithms, global optimization. paper use group, action orbit understand evolutionary solve nonconvex optimization problems.",4 "weaving multi-scale context single shot detector. aggregating context information multiple scales proved effective improving accuracy single shot detectors (ssds) object detection. however, existing multi-scale context fusion techniques computationally expensive, unfavorably diminishes advantageous speed ssd. work, propose novel network topology, called weavenet, efficiently fuse multi-scale information boost detection accuracy negligible extra cost. proposed weavenet iteratively weaves context information adjacent scales together enable sophisticated context reasoning maintaining fast speed. built stacking light-weight blocks, weavenet easy train without requiring batch normalization accelerated proposed architecture simplification. 
experimental results pascal voc 2007, pascal voc 2012 benchmarks show significant performance boost brought weavenet. 320x320 input batch size = 8, weavenet reaches 79.5% map pascal voc 2007 test 101 fps 4 fps extra cost, improves 79.7% map iterations.",4 "trax: visual tracking exchange protocol library. paper address problem developing on-line visual tracking algorithms. present specialized communication protocol serves bridge tracker implementation utilizing application. decouples development algorithms application, encouraging re-usability. primary use case algorithm evaluation protocol facilitates complex evaluation scenarios used nowadays thus pushing forward field visual tracking. present reference implementation protocol makes easy use several popular programming languages discuss protocol already used usage scenarios envision future.",4 "visual representation wittgenstein's tractatus logico-philosophicus. paper present data visualization method together potential usefulness digital humanities philosophy language. compile multilingual parallel corpus different versions wittgenstein's tractatus logico-philosophicus, including original german translations english, spanish, french, russian. using corpus, compute similarity measure propositions render visual network relations different languages.",4 "leveraging large amounts weakly supervised data multi-language sentiment classification. paper presents novel approach multi-lingual sentiment classification short texts. challenging task amount training data languages english limited. previously proposed multi-lingual approaches typically require establish correspondence english powerful classifiers already available. contrast, method require supervision. leverage large amounts weakly-supervised data various languages train multi-layer convolutional network demonstrate importance using pre-training networks.
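The "weakly-supervised data" in the sentiment record above is typically produced by labelling raw text with noisy surface cues. A minimal sketch of such a labelling rule, where the cue lists and the function name are illustrative assumptions rather than the paper's actual procedure:

```python
def weak_label(text):
    """Assign a noisy sentiment label from surface cues.

    A common proxy for weak supervision; the cue lists below are
    illustrative, not taken from the paper.
    """
    pos_cues = (":)", ":-)", "(y)")
    neg_cues = (":(", ":-(", ">:(")
    has_pos = any(c in text for c in pos_cues)
    has_neg = any(c in text for c in neg_cues)
    if has_pos and not has_neg:
        return 1
    if has_neg and not has_pos:
        return -1
    return 0  # ambiguous or unlabeled
```

Labels obtained this way are noisy but plentiful, which is what makes them usable for pre-training a network before fine-tuning on the limited supervised data.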
thoroughly evaluate approach various multi-lingual datasets, including recent semeval-2016 sentiment prediction benchmark (task 4), achieved state-of-the-art performance. also compare performance model trained individually language variant trained languages once. show latter model reaches slightly worse - still acceptable - performance compared single language model, benefiting better generalization properties across languages.",4 "learning maps: visual common sense autonomous driving. today's autonomous vehicles rely extensively high-definition 3d maps navigate environment. approach works well maps completely up-to-date, safe autonomous vehicles must able corroborate map's information via real time sensor-based system. goal work develop model road layout inference given imagery on-board cameras, without reliance high-definition maps. however, sufficient dataset training model exists. here, leverage availability standard navigation maps corresponding street view images construct automatically labeled, large-scale dataset complex scene understanding problem. matching road vectors metadata navigation maps google street view images, assign ground truth road layout attributes (e.g., distance intersection, one-way vs. two-way street) images. train deep convolutional networks predict road layout attributes given single monocular rgb image. experimental evaluation demonstrates model learns correctly infer road attributes using panoramas captured car-mounted cameras input. additionally, results indicate method may suitable novel application recommending safety improvements infrastructure (e.g., suggesting alternative speed limit street).",4 "expectation-propagation likelihood-free inference. many models interest natural social sciences closed-form likelihood function, means cannot treated using usual techniques statistical inference. case models efficiently simulated, bayesian inference still possible thanks approximate bayesian computation (abc) algorithm. 
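For context on the EP-ABC record above, the basic rejection-ABC loop it refines can be sketched as follows; the Gaussian toy simulator, the flat prior, and the tolerance `eps` are illustrative assumptions:

```python
import random

def simulate(theta, rng, n=30):
    """Toy simulator: the mean of n draws from N(theta, 1)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n

def abc_rejection(obs, eps, n_draws=2000, seed=0):
    """Basic rejection ABC: draw theta from the prior, simulate a summary
    statistic, keep theta when the statistic lands within eps of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # flat prior on [-5, 5]
        if abs(simulate(theta, rng) - obs) <= eps:
            accepted.append(theta)
    return accepted

posterior = abc_rejection(obs=1.3, eps=0.3)
```

The accepted draws approximate the posterior. As the record below notes, EP-ABC replaces this single global accept/reject constraint with one local constraint per data point, which is where much of its speedup comes from.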
although many refinements suggested, abc inference still far routine. abc often excruciatingly slow due low acceptance rates. addition, abc requires introducing vector ""summary statistics"", choice relatively arbitrary, often require trial error, making whole process quite laborious user. introduce work ep-abc algorithm, adaptation likelihood-free context variational approximation algorithm known expectation propagation (minka, 2001). main advantage ep-abc faster orders magnitude standard algorithms, producing overall approximation error typically negligible. second advantage ep-abc replaces usual global abc constraint vector summary statistics computed whole dataset, n local constraints form apply separately data-point. consequence, often possible away summary statistics entirely. case, ep-abc approximates directly evidence (marginal likelihood) model. comparisons performed three real-world applications typical likelihood-free inference, including one application neuroscience novel, possibly challenging standard abc techniques.",19 "diffusion-convolutional neural networks. present diffusion-convolutional neural networks (dcnns), new model graph-structured data. introduction diffusion-convolution operation, show diffusion-based representations learned graph-structured data used effective basis node classification. dcnns several attractive qualities, including latent representation graphical data invariant isomorphism, well polynomial-time prediction learning represented tensor operations efficiently implemented gpu. several experiments real structured datasets, demonstrate dcnns able outperform probabilistic relational models kernel-on-graph methods relational node classification tasks.",4 "phase transition sonfis&sorst. study, introduce general frame many connected intelligent particles systems (macips). 
connections interconnections particles get complex behavior merely simple system (system system). contribution natural computing, information granulation theory, main topics spacious skeleton. upon clue, organize two algorithms involved prominent intelligent computing approximate reasoning methods: self organizing feature map (som), neuro-fuzzy inference system rough set theory (rst). this, show algorithms taken linkage government-society interaction, government catches various fashions behavior: solid (absolute) flexible. so, transition society, changing connectivity parameters (noise) order disorder inferred. add this, one may find indirect mapping among financial systems eventual market fluctuations macips. keywords: phase transition, sonfis, sorst, many connected intelligent particles system, society-government interaction",4 "depth monocular images using semi-parallel deep neural network (spdnn) hybrid architecture. convolutional neural network (cnn) techniques applied problem determining depth single camera image (monocular depth). fully connected cnn topologies preserve details input images, enabling detection fine details, miss larger features; networks employ 2x2, 4x4 8x8 max-pooling operators determine larger features expense finer details. designing, training optimising set topologies, networks may combined single network topology using graph optimization techniques. ""semi parallel deep neural network (spdnn)"" eliminates duplicate common network layers, reducing network size computational effort significantly, optimized retraining achieve improved level convergence individual topologies. study, four models trained evaluated 2 stages kitti dataset. ground truth images first part experiment come benchmark, second part, ground truth images depth map results applying state-of-the-art stereo matching method. results evaluation demonstrate using post-processing techniques refine target network increases accuracy depth estimation individual mono images.
second evaluation shows using segmentation data input improve depth estimation results point performance comparable stereo depth estimation. computational time also discussed study.",4 "gaussian process regression student-t likelihood. paper considers robust efficient implementation gaussian process regression student-t observation model. challenge student-t model analytically intractable inference several approximative methods proposed. expectation propagation (ep) found accurate method many empirical studies convergence ep known problematic models containing non-log-concave site functions student-t distribution. paper illustrate situations standard ep fails converge review different modifications alternative algorithms improving convergence. demonstrate convergence problems may occur type-ii maximum posteriori (map) estimation hyperparameters show standard ep may converge map values difficult cases. present robust implementation relies primarily parallel ep updates utilizes moment-matching-based double-loop algorithm adaptively selected step size difficult cases. predictive performance ep compared laplace, variational bayes, markov chain monte carlo approximations.",19 "hyper-heuristics achieve optimal performance pseudo-boolean optimisation. selection hyper-heuristics randomised search methodologies choose execute heuristics set low-level heuristics. recent research leadingones benchmark function shown standard simple random, permutation, random gradient, greedy reinforcement learning selection mechanisms show effects learning. idea behind learning mechanisms continue exploit currently selected heuristic long successful. however, probability promising heuristic successful next step relatively low perturbing reasonable solution combinatorial optimisation problem. paper generalise `simple' selection-perturbation mechanisms success measured fixed period time tau, rather single iteration. 
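The "success measured over a fixed period tau" mechanism in the hyper-heuristics record above can be sketched on LeadingOnes; the concrete low-level heuristics (flip 1 or flip 2 random bits) and the parameter values are illustrative assumptions:

```python
import random

def leading_ones(x):
    """Number of consecutive ones from the left of the bitstring."""
    n = 0
    for b in x:
        if b != 1:
            break
        n += 1
    return n

def flip(x, k, rng):
    """Low-level heuristic: flip k distinct random bits."""
    y = list(x)
    for i in rng.sample(range(len(x)), k):
        y[i] ^= 1
    return y

def generalised_random_gradient(n=30, tau=20, max_iters=100000, seed=1):
    """Judge the chosen low-level heuristic over a period of tau iterations:
    keep it while a period yields an improvement, otherwise re-select
    uniformly at random."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    k = rng.choice([1, 2])
    iters = 0
    while leading_ones(x) < n and iters < max_iters:
        improved = False
        for _ in range(tau):
            iters += 1
            y = flip(x, k, rng)
            if leading_ones(y) > leading_ones(x):  # elitist acceptance
                x, improved = y, True
        if not improved:  # unsuccessful period: pick a new heuristic
            k = rng.choice([1, 2])
    return x, iters

best, iters = generalised_random_gradient()
```

The single-iteration mechanisms the record criticises correspond to tau = 1; the generalisation simply makes the success window a parameter.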
present benchmark function necessary learn exploit particular low-level heuristic, rigorously proving makes difference efficient inefficient algorithm. leadingones prove generalised random gradient, generalised greedy gradient hyper-heuristics achieve optimal performance, generalised greedy, although fast, still outperforms random local search. performance former two hyper-heuristics improves number operators choose increases, generalised greedy hyper-heuristic not. experimental analyses confirm results realistic problem sizes shed light best choices parameter tau various situations.",4 "deconvolution layer convolutional layer?. note, want focus aspects related two questions people asked us cvpr network presented. firstly, relationship proposed layer deconvolution layer? secondly, convolutions low-resolution (lr) space better choice? key questions tried answer paper, able go much depth clarity would liked space allowance. better answer questions note, first discuss relationships deconvolution layer forms transposed convolution layer, sub-pixel convolutional layer efficient sub-pixel convolutional layer. refer efficient sub-pixel convolutional layer convolutional layer lr space distinguish common sub-pixel convolutional layer. show fixed computational budget complexity, network convolutions exclusively lr space representation power speed network first upsamples input high resolution space.",4 "sharpened error bounds random sampling based $\ell_2$ regression. given data matrix $x \in r^{n\times d}$ response vector $y \in r^{n}$, suppose $n>d$, costs $o(n d^2)$ time $o(n d)$ space solve least squares regression (lsr) problem. $n$ $d$ large, exactly solving lsr problem expensive. $n \gg d$, one feasible approach speeding lsr randomly embed $y$ columns $x$ smaller subspace $r^c$; induced lsr problem number columns much fewer number rows, solved $o(c d^2)$ time $o(c d)$ space. discuss paper two random sampling based methods solving lsr efficiently. 
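A toy illustration of the sampling idea in the $\ell_2$ regression record above, assuming uniform row sampling and a two-column design for brevity (the record's main analysis concerns leverage-score sampling, which is not reproduced here):

```python
import random

def solve_2col_lsr(rows):
    """Exact least squares for a two-column design via 2x2 normal equations."""
    a = b = c = e = f = 0.0
    for (x1, x2), y in rows:
        a += x1 * x1; b += x1 * x2; c += x2 * x2
        e += x1 * y;  f += x2 * y
    det = a * c - b * b
    return ((c * e - b * f) / det, (a * f - b * e) / det)

rng = random.Random(0)
data = []  # n = 5000 rows, d = 2, true weights (2, -1)
for _ in range(5000):
    x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append(((x1, x2), 2.0 * x1 - 1.0 * x2 + rng.gauss(0, 0.1)))

w_full = solve_2col_lsr(data)                        # O(n d^2) full solve
w_sketch = solve_2col_lsr(rng.sample(data, 400))     # O(c d^2), c sampled rows
```

With c much smaller than n the sampled solve is proportionally cheaper while staying close to the full solution, which is the trade-off the record's bounds quantify.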
previous work showed leverage scores based sampling based lsr achieves $1+\epsilon$ accuracy $c \geq o(d \epsilon^{-2} \log d)$. paper sharpen error bound, showing $c = o(d \log d + d \epsilon^{-1})$ enough achieving $1+\epsilon$ accuracy. also show $c \geq o(\mu \epsilon^{-2} \log d)$, uniform sampling based lsr attains $2+\epsilon$ bound positive probability.",4 "asf+ --- an asf-like specification language. maintaining main aspects algebraic specification language asf presented [bergstra&al.89] extend asf following concepts: exported names asf must stay visible top module hierarchy, asf+ permits sophisticated hiding signature names. erroneous merging distinct structures occurs importing different actualizations parameterized module asf avoided asf+ adequate form parameter binding. new ``namensraum''-concept asf+ permits specifier one hand directly identify origin hidden names decide whether imported module accessed whether important property modified. first case access one single globally provided version; second import copy module. finally asf+ permits semantic conditions parameters specification tasks theorem prover.",4 "propagating uncertainty multi-stage bayesian convolutional neural networks application pulmonary nodule detection. motivated problem computer-aided detection (cad) pulmonary nodules, introduce methods propagate fuse uncertainty information multi-stage bayesian convolutional neural network (cnn) architecture. question seek answer ""can take advantage model uncertainty provided one deep learning model improve performance subsequent deep learning models ultimately overall performance multi-stage bayesian deep learning architecture?"". experiments show propagating uncertainty pipeline enables us improve overall performance terms final prediction accuracy model confidence.",4 "neural networks model venezuelan economy. besides indicator gdp, central bank venezuela generates called monthly economic activity general indicator.
priori knowledge indicator, represents sometimes even anticipates economy's fluctuations, could helpful developing public policies investment decision making. purpose study forecasting igaem non parametric methods, approach proven effective wide variety problems economics finance.",4 "framework automated cell tracking phase contrast microscopic videos based normal velocities. paper introduces novel framework automated tracking cells, particular focus challenging situation phase contrast microscopic videos. framework based topology preserving variational segmentation approach applied normal velocity components obtained optical flow computations, appears yield robust tracking automated extraction cell trajectories. order obtain improved trackings local shape features discuss additional correction step based active contours image laplacian optimize example class transformed renal epithelial (mdck-f) cells. also test framework human melanoma cells murine neutrophil granulocytes seeded different types extracellular matrices. results validated manual tracking results.",16 "framework on-line devanagari handwritten character recognition. main challenge on-line handwritten character recognition indian language large size character set, larger similarity different characters script huge variation writing style. paper propose framework on-line handwritten script recognition taking cues speech signal processing literature. framework based identifying strokes, turn lead recognition handwritten on-line characters rather conventional character identification. though framework described devanagari script, framework general applied language. proposed platform consists pre-processing, feature extraction, recognition post processing like conventional character recognition applied strokes. on-line devanagari character recognition reduces one recognizing one 69 primitives recognition character performed recognizing sequence primitives.
show impact noise removal on-line raw data usually noisy. use fuzzy directional features enhance accuracy stroke recognition also described. recognition results compared commonly used directional features literature using several classifiers.",4 "winning arguments: interaction dynamics persuasion strategies good-faith online discussions. changing someone's opinion arguably one important challenges social interaction. underlying process proves difficult study: hard know someone's opinions formed whether someone's views shift. fortunately, changemyview, active community reddit, provides platform users present opinions reasoning, invite others contest them, acknowledge ensuing discussions change original views. work, study interactions understand mechanisms behind persuasion. find persuasive arguments characterized interesting patterns interaction dynamics, participant entry-order degree back-and-forth exchange. furthermore, comparing similar counterarguments opinion, show language factors play essential role. particular, interplay language opinion holder counterargument provides highly predictive cues persuasiveness. finally, since even favorable setting people may persuaded, investigate problem determining whether someone's opinion susceptible changed all. difficult task, show stylistic choices opinion expressed carry predictive power.",4 "convolutional neural network architectures matching natural language sentences. semantic matching central importance many natural language tasks \cite{bordes2014semantic,retrievalqa}. successful matching algorithm needs adequately model internal structures language objects interaction them. step toward goal, propose convolutional neural network models matching two sentences, adapting convolutional strategy vision speech. proposed models nicely represent hierarchical structures sentences layer-by-layer composition pooling, also capture rich matching patterns different levels.
models rather generic, requiring prior knowledge language, hence applied matching tasks different nature different languages. empirical study variety matching tasks demonstrates efficacy proposed model variety matching tasks superiority competitor models.",4 "introduction convolutional neural networks. field machine learning taken dramatic twist recent times, rise artificial neural network (ann). biologically inspired computational models able far exceed performance previous forms artificial intelligence common machine learning tasks. one impressive forms ann architecture convolutional neural network (cnn). cnns primarily used solve difficult image-driven pattern recognition tasks precise yet simple architecture, offers simplified method getting started anns. document provides brief introduction cnns, discussing recently published papers newly formed techniques developing brilliantly fantastic image recognition models. introduction assumes familiar fundamentals anns machine learning.",4 "segmentation classification cine-mr images using fully convolutional networks handcrafted features. three-dimensional cine-mri crucial importance assessing cardiac function. features describe anatomy function cardiac structures (e.g. left ventricle (lv), right ventricle (rv), myocardium(mc)) known significant diagnostic value computed 3d cine-mr images. however, features require precise segmentation cardiac structures. among fully automated segmentation methods, fully convolutional networks (fcn) skip connections shown robustness medical segmentation problems. study, develop complete pipeline classification subjects cardiac conditions based 3d cine-mri. segmentation task, develop 2d fcn introduce parallel paths (pp) way exploit 3d information cine-mr image. classification task, 125 features extracted segmented structures, describing anatomy function. next, two-stage pipeline feature selection using lasso method developed. subset 20 features selected classification. 
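The LASSO feature-selection stage mentioned in the cine-MR record above can be illustrated with a toy coordinate-descent implementation; the data, dimensions and regularisation strength below are made up and unrelated to the paper's 125 cardiac features:

```python
import random

def soft_threshold(z, t):
    return z - t if z > t else (z + t if z < -t else 0.0)

def lasso_cd(X, y, lam, n_iters=100):
    """Lasso by cyclic coordinate descent for
    (1/2n)*||y - Xw||^2 + lam*||w||_1.  Exact zeros fall out of the
    soft-threshold update, which is what performs the feature selection."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = zj = 0.0
            for i in range(n):
                # residual excluding feature j's contribution
                r_j = y[i] - sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * r_j
                zj += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (zj / n)
    return w

rng = random.Random(0)
n, d = 200, 5
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
# only the first two features carry signal; lasso should zero out the rest
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.1) for row in X]
w = lasso_cd(X, y, lam=0.2)
```

Features whose coefficients survive the shrinkage form the selected subset passed on to the classifiers.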
subject classified using ensemble logistic regression, multi-layer perceptron, support vector machine classifiers majority voting. dice coefficient segmentation 0.95+-0.03, 0.89+-0.13, 0.90+-0.03 lv, rv, mc respectively. 8-fold cross validation accuracy classification task 95.05% 92.77% based ground truth proposed methods segmentations respectively. results show pps increase segmentation accuracy, exploiting spatial relations. moreover, classification algorithm features showed discriminability keeping sensitivity segmentation error low possible.",4 "sketch-to-design: context-based part assembly. designing 3d objects scratch difficult, especially user intent fuzzy without clear target form. spirit modeling-by-example, facilitate design providing reference inspiration existing model contexts. rethink model design navigating different possible combinations part assemblies based large collection pre-segmented 3d models. propose interactive sketch-to-design system, user sketches prominent features parts combine. sketched strokes analyzed individually context parts generate relevant shape suggestions via design gallery interface. session progresses parts get selected, contextual cues becomes increasingly dominant system quickly converges final design. key enabler, use pre-learned part-based contextual information allow user quickly explore different combinations parts. experiments demonstrate effectiveness approach efficiently designing new variations existing shapes.",4 "parcellation visual cortex high-resolution histological brain sections using convolutional neural networks. microscopic analysis histological sections considered ""gold standard"" verify structural parcellations human brain. high resolution allows study laminar columnar patterns cell distributions, build important basis simulation cortical areas networks. however, cytoarchitectonic mapping semiautomatic, time consuming process scale high throughput imaging. 
present automatic approach parcellating histological sections 2um resolution. based convolutional neural network combines topological information probabilistic atlases texture features learned high-resolution cell-body stained images. model applied visual areas trained sparse set partial annotations. show predictions transferable new brains spatially consistent across sections.",4 "tools terminology processing. automatic terminology processing appeared 10 years ago electronic corpora became widely available. processing may statistically linguistically based produces terminology resources used number applications : indexing, information retrieval, technology watch, etc. present tools developed irin institute. take input texts (or collection texts) reflect different states terminology processing: term acquisition, term recognition term structuring.",4 "new point-set registration algorithm fingerprint matching. novel minutia-based fingerprint matching algorithm proposed employs iterative global alignment two minutia sets. matcher considers possible minutia pairings iteratively aligns two sets number minutia pairs exceed maximum number allowable one-to-one pairings. optimal alignment parameters derived analytically via linear least squares. first alignment establishes region overlap two minutia sets, (iteratively) refined successive alignment. alignment, minutia pairs exhibit weak correspondence discarded. process repeated number remaining pairs longer exceeds maximum number allowable one-to-one pairings. proposed algorithm tested fvc2000 fvc2002 databases, results indicate proposed matcher effective efficient fingerprint authentication; fast utilize computationally expensive mathematical functions (e.g. trigonometric, exponential). 
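The fingerprint record above derives its own analytical least-squares alignment; as a generic sketch of that style of solution, a closed-form linear least-squares fit of a 2d similarity transform (scaled rotation plus translation) to paired points looks like this (the demo transform values are illustrative):

```python
import math

def align_least_squares(src, dst):
    """Closed-form linear least squares for a 2d similarity transform
    mapping paired points src -> dst:
        u = a*x - b*y + tx,   v = b*x + a*y + ty
    Centering the point sets decouples (a, b) from (tx, ty)."""
    n = float(len(src))
    mx = sum(p[0] for p in src) / n; my = sum(p[1] for p in src) / n
    mu = sum(p[0] for p in dst) / n; mv = sum(p[1] for p in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        den += xc * xc + yc * yc
    a, b = num_a / den, num_b / den
    return a, b, mu - (a * mx - b * my), mv - (b * mx + a * my)

# recover a known transform: 30-degree rotation, scale 1.1, shift (2, -1)
theta, s = math.radians(30.0), 1.1
a0, b0 = s * math.cos(theta), s * math.sin(theta)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0), (-1.0, 2.0)]
dst = [(a0 * x - b0 * y + 2.0, b0 * x + a0 * y - 1.0) for x, y in src]
a, b, tx, ty = align_least_squares(src, dst)
```

Because the model is linear in (a, b, tx, ty), the optimum is found in one pass with no trigonometric, exponential or iterative machinery, matching the record's emphasis on cheap analytical alignment.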
addition proposed matcher, another contribution paper analytical derivation least squares solution optimal alignment parameters two point-sets lacking exact correspondence.",4 "automatic text extraction character segmentation using maximally stable extremal regions. text detection segmentation important prerequisite many content based image analysis tasks. paper proposes novel text extraction character segmentation algorithm using maximally stable extremal regions basic letter candidates. regions subjected thresholding thereafter various connected components determined identify separate characters. algorithm tested along set various jpeg, png bmp images four different character sets; english, russian, hindi urdu. algorithm gives good results english russian character set; however character segmentation urdu hindi language much accurate. algorithm simple, efficient, involves overhead required training gives good results even low quality images. paper also proposes various challenges text extraction segmentation multilingual inputs.",4 "survey optical character recognition system. optical character recognition (ocr) topic interest many years. defined process digitizing document image constituent characters. despite decades intense research, developing ocr capabilities comparable human still remains open challenge. due challenging nature, researchers industry academic circles directed attentions towards optical character recognition. last years, number academic laboratories companies involved research character recognition increased dramatically. research aims summarizing research far done field ocr. provides overview different aspects ocr discusses corresponding proposals aimed resolving issues ocr.",4 "near-optimal algorithms online matrix prediction. several online prediction problems recent interest comparison class composed matrices bounded entries. 
example, online max-cut problem, comparison class matrices represent cuts given graph online gambling comparison class matrices represent permutations n teams. another important example online collaborative filtering widely used comparison class set matrices small trace norm. paper isolate property matrices, call (beta,tau)-decomposability, derive efficient online learning algorithm, enjoys regret bound o*(sqrt(beta tau t)) problems comparison class composed (beta,tau)-decomposable matrices. analyzing decomposability cut matrices, triangular matrices, low trace-norm matrices, derive near optimal regret bounds online max-cut, online gambling, online collaborative filtering. particular, resolves (in affirmative) open problem posed abernethy (2010); kleinberg et al (2010). finally, derive lower bounds three problems show upper bounds optimal logarithmic factors. particular, lower bound online collaborative filtering problem resolves another open problem posed shamir srebro (2011).",4 "dr.vae: drug response variational autoencoder. present two deep generative models based variational autoencoders improve accuracy drug response prediction. models, perturbation variational autoencoder semi-supervised extension, drug response variational autoencoder (dr.vae), learn latent representation underlying gene states drug application depend on: (i) drug-induced biological change gene (ii) overall treatment response outcome. vae-based models outperform current published benchmarks field anywhere 3 11% auroc 2 30% aupr. addition, found better reconstruction accuracy necessarily lead improvement classification accuracy jointly trained models perform better models minimize reconstruction error independently.",19 "online convex optimization using predictions. making use predictions crucial, under-explored, area online algorithms. paper studies class online optimization problems external noisy predictions available. 
propose stochastic prediction error model generalizes prior models learning stochastic control communities, incorporates correlation among prediction errors, captures fact predictions improve time passes. prove achieving sublinear regret constant competitive ratio online algorithms requires use unbounded prediction window adversarial settings, realistic stochastic prediction error models possible use averaging fixed horizon control (afhc) simultaneously achieve sublinear regret constant competitive ratio expectation using constant-sized prediction window. furthermore, show performance afhc tightly concentrated around mean.",4 "cfo: conditional focused neural question answering large-scale knowledge bases. enable computers automatically answer questions like ""who created character harry potter""? carefully built knowledge bases provide rich sources facts. however, remains challenge answer factoid questions raised natural language due numerous expressions one question. particular, focus common questions --- ones answered single fact knowledge base. propose cfo, conditional focused neural-network-based approach answering factoid questions knowledge bases. approach first zooms question find probable candidate subject mentions, infers final answers unified conditional probabilistic framework. powered deep recurrent neural networks neural embeddings, proposed cfo achieves accuracy 75.7% dataset 108k questions - largest public one date. outperforms current state art absolute margin 11.8%.",4 "weak convergence properties constrained emphatic temporal-difference learning constant slowly diminishing stepsize. consider emphatic temporal-difference (td) algorithm, etd($\lambda$), learning value functions stationary policies discounted, finite state action markov decision process. 
The ETD($\lambda$) algorithm was recently proposed by Sutton, Mahmood, and White to solve a long-standing divergence problem of the standard TD algorithm when it is applied to off-policy training, where data from an exploratory policy are used to evaluate other policies of interest. The almost sure convergence of ETD($\lambda$) has been proved in recent work under general off-policy training conditions, but only for a narrow range of diminishing stepsizes. In this paper we present convergence results for constrained versions of ETD($\lambda$) with constant stepsize and with diminishing stepsize from a broad range. Our results characterize the asymptotic behavior of the trajectory of iterates produced by those algorithms, and are derived by combining key properties of ETD($\lambda$) with powerful convergence theorems from the weak convergence methods in stochastic approximation theory. For the case of constant stepsize, in addition to analyzing the behavior of the algorithms in the limit as the stepsize parameter approaches zero, we also analyze their behavior for a fixed stepsize and bound the deviations of their averaged iterates from the desired solution. These results are obtained by exploiting the weak Feller property of the Markov chains associated with the algorithms, and by using ergodic theorems for weak Feller Markov chains in conjunction with the convergence results we get from the weak convergence methods. Besides ETD($\lambda$), our analysis also applies to the off-policy TD($\lambda$) algorithm, when the divergence issue is avoided by setting $\lambda$ sufficiently large.",4 "Training a large scale classifier with the quantum adiabatic algorithm. In a previous publication we proposed discrete global optimization as a method to train a strong binary classifier constructed as a thresholded sum over weak classifiers. Our motivation was to cast the training of a classifier into a format amenable to solution by the quantum adiabatic algorithm. Applying adiabatic quantum computing (AQC) promises to yield solutions superior to those achieved by classical heuristic solvers. Interestingly, we found that by using heuristic solvers to obtain approximate solutions we could already gain an advantage over the standard method AdaBoost. In this communication we generalize the baseline method to large scale classifier training. 
By large scale we mean either that the cardinality of the dictionary of candidate weak classifiers or the number of weak learners used in the strong classifier exceeds the number of variables that can be handled effectively in a single global optimization. For such situations we propose an iterative and piecewise approach, in which a subset of weak classifiers is selected in each iteration via global optimization. The strong classifier is then constructed by concatenating the subsets of weak classifiers. We show in numerical studies that the generalized method successfully competes with AdaBoost. We also provide theoretical arguments as to why the proposed optimization method, which does not only minimize the empirical loss but also adds L0-norm regularization, is superior to versions of boosting that only minimize the empirical loss. By conducting a quantum Monte Carlo simulation we gather evidence that the quantum adiabatic algorithm is able to handle a generic training problem efficiently.",18 "Electrocardiography separation of mother and baby. The extraction of the electrocardiography (ECG or EKG) signals of mother and baby is a challenging task, because one single device is used and it receives a mixture of multiple heart beats. In this paper, we would like to design a filter to separate the signals from each other.",4 "A simple proximal stochastic gradient method for nonsmooth nonconvex optimization. We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a possibly non-differentiable convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our algorithm is a slight variant of the ProxSVRG algorithm [Reddi et al., 2016b]. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results (in terms of the number of stochastic gradient oracle calls and proximal operations), and improves/generalizes others. In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed in [Lei et al., 2017] for the smooth nonconvex case. ProxSVRG+ is more straightforward than SCSG and yields simpler analysis. 
Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., 2016b]. Finally, for nonconvex functions satisfying the Polyak-{\L}ojasiewicz condition, we show that ProxSVRG+ achieves a global linear convergence rate without restart. ProxSVRG+ is never worse than ProxGD and ProxSVRG/SAGA, and sometimes outperforms them (and generalizes the results of SCSG) in this case.",12 "Random gradient extrapolation for distributed and stochastic optimization. In this paper, we consider a class of finite-sum convex optimization problems defined over a distributed multiagent network with $m$ agents connected to a central server. In particular, the objective function consists of the average of $m$ ($\ge 1$) smooth components associated with each network agent together with a strongly convex term. Our major contribution is to develop a new randomized incremental gradient algorithm, namely the random gradient extrapolation method (RGEM), which does not require any exact gradient evaluation even for the initial point, but can achieve the optimal ${\cal o}(\log(1/\epsilon))$ complexity bound in terms of the total number of gradient evaluations of component functions to solve the finite-sum problems. Furthermore, we demonstrate that for stochastic finite-sum optimization problems, RGEM maintains the optimal ${\cal o}(1/\epsilon)$ complexity (up to a certain logarithmic factor) in terms of the number of stochastic gradient computations, but attains an ${\cal o}(\log(1/\epsilon))$ complexity in terms of communication rounds (each round involves only one agent). It is worth noting that the former bound is independent of the number of agents $m$, while the latter one only linearly depends on $m$, or even on $\sqrt m$ for ill-conditioned problems. To the best of our knowledge, this is the first time that these complexity bounds have been obtained for distributed and stochastic optimization problems. Moreover, our algorithms were developed based on a novel dual perspective of Nesterov's accelerated gradient method.",12 "Pooled motion features for first-person videos. In this paper, we present a new feature representation for first-person videos. 
In first-person video understanding (e.g., activity recognition), it is important to capture both entire scene dynamics (i.e., egomotion) and salient local motion observed in videos. We describe a representation framework based on time series pooling, which is designed to abstract short-term/long-term changes in feature descriptor elements. The idea is to keep track of how descriptor values are changing over time and summarize them to represent motion in the activity video. The framework is general, handling any type of per-frame feature descriptor, including conventional motion descriptors like histogram of optical flows (HOF) as well as appearance descriptors from more recent convolutional neural networks (CNN). We experimentally confirm that our approach clearly outperforms previous feature representations, including bag-of-visual-words and improved Fisher vector (IFV), when using identical underlying feature descriptors. We also confirm that our feature representation has superior performance to existing state-of-the-art features like local spatio-temporal features and improved trajectory features (originally developed for 3rd-person videos) in handling first-person videos. Multiple first-person activity datasets were tested under various settings to confirm these findings.",4 "Correspondence insertion for as-projective-as-possible image stitching. Spatially varying warps are increasingly popular for image alignment. In particular, as-projective-as-possible (APAP) warps have proven effective for accurate panoramic stitching, especially in cases with significant depth parallax that defeat standard homographic warps. However, estimating spatially varying warps requires a sufficient number of feature matches. In image regions where feature detection or matching fail, the warp loses guidance and is unable to accurately model the true underlying warp, thus resulting in poor registration. In this paper, we propose a correspondence insertion method for APAP warps, with a focus on panoramic stitching. Our method automatically identifies misaligned regions, and inserts appropriate point correspondences to increase the flexibility of the warp and improve the alignment. 
Unlike other warp varieties, the underlying projective regularization of APAP warps reduces overfitting and geometric distortion, despite the increase in warp complexity. Comparisons with recent techniques for parallax-tolerant image stitching demonstrate the effectiveness and simplicity of our approach.",4 "Automatically generating commit messages from diffs using neural machine translation. Commit messages are a valuable resource in the comprehension of software evolution, since they provide a record of changes such as feature additions and bug repairs. Unfortunately, programmers often neglect to write good commit messages. Different techniques have been proposed to help programmers by automatically writing these messages. These techniques are effective at describing what changed, but are often verbose and lack context for understanding the rationale behind a change. In contrast, humans write messages that are short and summarize the high level rationale. In this paper, we adapt neural machine translation (NMT) to automatically ""translate"" diffs into commit messages. We trained an NMT algorithm using a corpus of diffs and human-written commit messages from the top 1k GitHub projects. We designed a filter to help ensure that we only trained the algorithm on higher-quality commit messages. Our evaluation uncovered a pattern in which the messages we generate tend to be either very high or very low quality. Therefore, we created a quality-assurance filter to detect cases in which we are unable to produce good messages, and return a warning instead.",4 "Detection of falls and selected actions in sequences of digital images (Detekcja upadku i wybranych akcji na sekwencjach obrazów cyfrowych). In recent years a growing interest in action recognition has been observed, including the detection of fall accidents of the elderly. However, despite the many efforts undertaken, the existing technology is not widely used by the elderly, mainly because of flaws like low precision, a large number of false alarms, and inadequate privacy preservation during data acquisition and processing. This research work meets these expectations. The work is empirical and situated in the field of computer vision systems. Its main part is situated in the area of action and behavior recognition. Efficient algorithms for fall detection were developed, tested and implemented using image sequences and a wireless inertial sensor worn by the monitored person. A set of descriptors for depth maps was elaborated to permit the classification of the pose as well as the action of a person. 
The experimental research was carried out on a prepared data repository consisting of synchronized depth and accelerometric data. The study covered both a scenario with a static camera facing the scene and an active camera observing the scene from above. The experimental results showed that the developed algorithms for fall detection have high sensitivity and specificity. The algorithms were designed with regard to low computational demands and the possibility to run on ARM platforms. Several experiments, including person detection, tracking and fall detection in real time, were carried out to show the efficiency and reliability of the proposed solutions.",4 "Learning and evaluating musical features with deep autoencoders. In this work we describe and evaluate methods to learn musical embeddings. Each embedding is a vector that represents four contiguous beats of music and is derived from a symbolic representation. We consider autoencoding-based methods, including denoising autoencoders and context reconstruction, and evaluate the resulting embeddings on a forward prediction and a classification task.",4 "Direction-aware spatial context features for shadow detection. Shadow detection is a fundamental and challenging task, since it requires an understanding of global image semantics and there are various backgrounds around shadows. This paper presents a novel network for shadow detection by analyzing image context in a direction-aware manner. To achieve this, we first formulate a direction-aware attention mechanism in a spatial recurrent neural network (RNN) by introducing attention weights when aggregating spatial context features in the RNN. By learning these weights through training, we can recover direction-aware spatial context (DSC) for detecting shadows. This design is developed into the DSC module and embedded in a CNN to learn DSC features at different levels. Moreover, a weighted cross entropy loss is designed to make the training more effective. We employ two common shadow detection benchmark datasets and perform various experiments to evaluate our network. Experimental results show that our network outperforms state-of-the-art methods and achieves 97% accuracy and a 38% reduction in balance error rate.",4 "Modeling state in software debugging of VHDL-RTL designs -- a model-based diagnosis approach. 
In this paper we outline an approach to applying model-based diagnosis to the field of automatic software debugging of hardware designs. We present a value-level model for debugging VHDL-RTL designs and show how to localize the erroneous component responsible for an observed misbehavior. Furthermore, we discuss an extension of the model that supports the debugging of sequential circuits at a given point in time, and also allows considering the temporal behavior of VHDL-RTL designs. The introduced model is capable of handling the state inherently present in every sequential circuit. The principal applicability of the new model is outlined briefly, and we use industrial-sized real world examples from the ISCAS'85 benchmark suite to discuss the scalability of our approach.",4 "Supervised texture segmentation: a comparative study. This paper aims to compare four different types of feature extraction approaches in terms of texture segmentation. The feature extraction methods used for segmentation are Gabor filters (GF), Gaussian Markov random fields (GMRF), the run-length matrix (RLM) and the co-occurrence matrix (GLCM). It is shown that GF performed best in terms of quality of segmentation, while GLCM localises texture boundaries better compared to the other methods.",4 "Facial landmarks detection by self-iterative regression based on a landmarks-attention network. Cascaded regression (CR) based methods have been proposed to solve the facial landmarks detection problem; they learn a series of descent directions with multiple cascaded regressors separately trained for coarse and fine stages. They outperform traditional gradient descent based methods in both accuracy and running speed. However, cascaded regression is not robust enough, because each regressor's training data comes from the output of the previous regressor. Moreover, training multiple regressors requires a lot of computing resources, especially for deep learning based methods. In this paper, we develop a self-iterative regression (SIR) framework to improve the model efficiency. Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and its parameters are iteratively updated by the same regressor. Specifically, we propose a landmarks-attention network (LAN) as our regressor, which concurrently learns features around each landmark and obtains the holistic location increment. 
By doing so, the rest of the regressors are removed to simplify the training process, and the number of model parameters is significantly decreased. Experiments demonstrate that with only 3.72M model parameters, the proposed method achieves state-of-the-art performance.",4 "Semi-amortized variational autoencoders. Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network. While AVI has enabled efficient training of deep generative models such as variational autoencoders (VAE), recent empirical work suggests that inference networks can produce suboptimal variational parameters. We propose a hybrid approach, using AVI to initialize the variational parameters and running stochastic variational inference (SVI) to refine them. Crucially, the local SVI procedure is itself differentiable, so the inference network and generative model can be trained end-to-end with gradient-based optimization. This semi-amortized approach enables the use of rich generative models without experiencing the posterior-collapse phenomenon common in training VAEs for problems like text generation. Experiments show that this approach outperforms strong autoregressive and variational baselines on standard text and image datasets.",19 "3D face reconstruction by learning from synthetic data. Fast and robust three-dimensional reconstruction of facial geometric structure from a single image is a challenging task with numerous applications. Here, we introduce a learning-based approach for reconstructing a three-dimensional face from a single image. Recent face recovery methods rely on accurate localization of key characteristic points. In contrast, the proposed approach is based on a convolutional neural network (CNN) which extracts the face geometry directly from its image. Although such deep architectures outperform other models in complex computer vision problems, training them properly requires a large dataset of annotated examples. In the case of three-dimensional faces, currently, there are no large-volume data sets, and acquiring such big data is a tedious task. As an alternative, we propose to generate random, yet nearly photo-realistic, facial images whose geometric form is known. 
The suggested model successfully recovers facial shapes from real images, even for faces with extreme expressions and under various lighting conditions.",4 "Geometric enclosing networks. Training a model to generate data has increasingly attracted research attention and become important in modern world applications. We propose in this paper a new geometry-based optimization approach to address this problem. Orthogonal to current state-of-the-art density-based approaches, notably VAE and GAN, we present a fresh new idea that borrows the principle of the minimal enclosing ball to train a generator G\left(\bz\right) in such a way that both training and generated data, after being mapped to the feature space, are enclosed in the same sphere. We develop theory to guarantee that the mapping is bijective, so that its inverse from feature space to data space results in expressive nonlinear contours to describe the data manifold, hence ensuring that data generated are also lying on the data manifold learned from the training data. Our model enjoys a nice geometric interpretation, hence termed geometric enclosing networks (GEN), and possesses some key advantages over its rivals, namely a simple and easy-to-control optimization formulation, avoidance of mode collapsing, and efficient learning of the data manifold representation in a completely unsupervised manner. We conducted extensive experiments on synthetic and real-world datasets to illustrate the behaviors, strengths and weaknesses of our proposed GEN, in particular its ability to handle multi-modal data and the quality of the generated data.",4 "Automatic detection of trends in dynamical text: an evolutionary approach. This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, that is, time-stamped collections of documents such as newscasts, e-mails, IRC conversations, scientific journal archives and weblog postings. The algorithm assigns frequencies (number of document arrivals per time unit) to time intervals so that it produces an optimal fit to the data. The optimization is a trade-off between accurately fitting the data and avoiding too many frequency changes; in this way the analysis is able to find fits which ignore the noise. Classical dynamic programming algorithms are limited by memory and efficiency requirements, which can be a problem when dealing with long streams. This suggests exploring alternative search methods that allow some degree of uncertainty to achieve tractability. 
Experiments have shown that the designed evolutionary algorithm is able to reach the solution quality of classical dynamic programming algorithms in a shorter time. We have also explored different probabilistic models to optimize the fitting of the date streams, and applied these algorithms to infer whether a new arrival increases or decreases the {\em interest} in the topic the document stream is about.",4 "Towards reducing the multidimensionality of OLAP cubes using evolutionary algorithms and factor analysis methods. Data warehouses are structures holding a large amount of data collected from heterogeneous sources, to be used in a decision support system. Data warehouse analysis identifies hidden patterns that are initially unexpected, and this analysis requires great memory and computation cost. Data reduction methods have been proposed to make this analysis easier. In this paper, we present a hybrid approach based on genetic algorithms (GA) as evolutionary algorithms and multiple correspondence analysis (MCA) as a factor analysis method to conduct this reduction. Our approach identifies a reduced subset of dimensions, from the initial subset p to p'