• Science of Science
  1. A decision support framework for robust R&D budget allocation using machine learning and optimization

    • Motivation: government R&D budget
    • Research question: decision support systems to maximize the total expected R&D output
    • Proposed method:
      • R&D output prediction model
      • robust optimization technique to hedge against uncertainty
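
To make the idea of hedging against prediction uncertainty concrete, here is a minimal sketch. It is not the paper's formulation. It assumes (hypothetically) that each program's output is linear in its budget, that the prediction error is bounded by a known per-program deviation (box uncertainty), and that each program has a budget cap. Under those assumptions, maximizing the worst-case total output reduces to a greedy allocation on the pessimistic rates. The names `robust_allocate`, `predicted`, `deviation`, and `caps` are illustrative.

```python
def robust_allocate(predicted, deviation, caps, budget):
    """Allocate a budget to maximize worst-case total output.

    predicted[i]: predicted output per unit of budget for program i (assumed)
    deviation[i]: largest downward prediction error considered (assumed)
    caps[i]:      maximum budget program i can absorb (assumed)
    """
    # Worst-case rate for each program under box uncertainty.
    worst = [p - d for p, d in zip(predicted, deviation)]
    # Fund programs in decreasing order of worst-case rate.
    order = sorted(range(len(worst)), key=lambda i: worst[i], reverse=True)
    alloc = [0.0] * len(worst)
    remaining = budget
    for i in order:
        if remaining <= 0 or worst[i] <= 0:
            break
        alloc[i] = min(caps[i], remaining)
        remaining -= alloc[i]
    worst_case_output = sum(w * a for w, a in zip(worst, alloc))
    return alloc, worst_case_output


alloc, guaranteed = robust_allocate(
    predicted=[3.0, 2.5, 1.0],
    deviation=[1.5, 0.2, 0.1],
    caps=[50, 50, 50],
    budget=80,
)
```

In this toy example the first program has the highest predicted rate but also the largest uncertainty, so the robust allocation fills the second program first. An allocation based on the predictions alone would do the opposite.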
  2. A multi-objective approach for profit-driven feature selection in credit scoring

    • Motivation: feature selection in credit scoring
    • Research gap: standard feature selection methods rely only on statistical criteria
    • Proposed method:
      • extend the use of profit measures to feature selection
      • develop a multi-objective wrapper framework based on the NSGA-II genetic algorithm with two fitness functions: the Expected Maximum Profit (EMP) and the number of features.
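
The two fitness functions above pull in opposite directions: EMP is to be maximized and the number of features minimized. The sketch below is not a full NSGA-II implementation. It only shows the Pareto-dominance relation that NSGA-II's non-dominated sorting builds on, applied to candidate feature subsets. The EMP values and the function names are made up for illustration.

```python
def dominates(a, b):
    """Return True if candidate a dominates candidate b.

    Each candidate is a pair (emp, n_features): higher EMP is better,
    fewer features is better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (EMP, number of features) pairs for five feature subsets.
candidates = [(0.10, 5), (0.12, 8), (0.09, 5), (0.12, 10), (0.08, 3)]
front = pareto_front(candidates)
```

Here `(0.09, 5)` is dropped because `(0.10, 5)` earns more with the same number of features, and `(0.12, 10)` is dropped because `(0.12, 8)` earns the same with fewer features. The remaining subsets are the trade-offs a decision maker would choose among.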
  3. A Novel Method for Topic Linkages Between Scientific Publications and Patents

    • Motivation: understanding the relationships between science and technology.
    • Research gap: Previous studies on the linkages mainly focus on the analysis of nonpatent references on the front page of patents, or the resulting citation-link networks, but with unsatisfactory performance.
    • Proposed method:
      • a novel statistical entity-topic model (named the CCorrLDA2 model), armed with the collapsed Gibbs sampling inference algorithm, is proposed to discover the hidden topics in academic articles and in patents, respectively.
      • the topic-linkage construction problem is transformed into the well-known optimal transportation problem after topic similarity is calculated on the basis of the symmetrized Kullback–Leibler (KL) divergence.
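
The topic-similarity step can be sketched as follows. This is a minimal illustration, not the paper's code. It assumes each topic is a strictly positive probability distribution over a shared vocabulary, and it takes the symmetrized KL divergence to be the average of the two directed divergences (some authors use the sum instead). The resulting matrix is the kind of cost input an optimal transportation solver would take; the solver itself is not shown.

```python
import math


def kl(p, q):
    """Directed KL divergence KL(p || q) for distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def sym_kl(p, q):
    """Symmetrized KL divergence, taken here as the average of both directions."""
    return 0.5 * (kl(p, q) + kl(q, p))


def cost_matrix(paper_topics, patent_topics):
    """Pairwise divergences between paper topics (rows) and patent topics (columns)."""
    return [[sym_kl(p, q) for q in patent_topics] for p in paper_topics]


# Toy topic-word distributions over a three-word vocabulary (made-up numbers).
paper_topics = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
patent_topics = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
costs = cost_matrix(paper_topics, patent_topics)
```

A small entry in `costs` means the two topics put their probability mass on similar words, so linking them is cheap.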
  4. Data objects and documenting scientific processes: An analysis of data events in biodiversity data papers

    • Motivation: data paper
    • Research gap: Research examining how data papers report data events, such as data transactions and manipulations, is limited.
    • Method & findings:
      • A content analysis was conducted examining the full texts of 82 data papers
      • Data events recorded for each paper were organized into a set of 17 categories.
      • The findings challenge the degree to which data papers are a distinct genre compared to research articles, and the degree to which they describe data‐centric research processes in a thorough way.
  5. Examining scientific writing styles from the perspective of linguistic complexity

    • Motivation: Publishing articles in high‐impact English journals is difficult for non‐native English‐speaking scholars (NNESs)
    • Research gap: the differences in English scientific writing between native English‐speaking scholars (NESs) and NNESs have yet to be uncovered
    • Proposed method:
      • examined the scientific writing styles in English from a two‐fold perspective of linguistic complexity:
      • (a) syntactic complexity, including measurements of sentence length and sentence complexity;
      • (b) lexical complexity, including measurements of lexical diversity, lexical density, and lexical sophistication.
    • Findings: The observations suggest marginal differences between the two groups in syntactic and lexical complexity.
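
A rough sketch of how some of the complexity measures listed above could be computed is given below. It is not the study's pipeline. It uses mean sentence length for syntactic complexity, the type-token ratio for lexical diversity, and the share of non-function words for lexical density. The tiny `FUNCTION_WORDS` list is a stand-in assumption for proper part-of-speech tagging, and lexical sophistication is omitted because it needs an external word-frequency list.

```python
import re

# Small illustrative stoplist; a real study would use POS tagging instead.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "and", "or", "to", "is", "are",
    "was", "were", "that", "this", "with", "for", "by", "as", "at", "it",
}


def complexity(text):
    """Return simple syntactic and lexical complexity measures for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "mean_sentence_length": len(tokens) / len(sentences),
        "type_token_ratio": len(set(tokens)) / len(tokens),
        "lexical_density": len(content) / len(tokens),
    }


sample = "The model is simple. The model predicts the output of the system."
scores = complexity(sample)
```

The type-token ratio is sensitive to text length, so comparisons between groups of authors would need texts of similar size or a length-corrected variant.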
  6. Modeling the relationship between scientific and bibliographic classification for music

    • Motivation: Scientific classification is an important topic in contemporary knowledge organization discourse
    • Research gap: the nature of the relationships between scientific and bibliographic classifications has not been fully studied.
    • Proposed method:
      • start from the connections between scientific and bibliographic classifications for music
      • Three relationship characteristics are posited: similarity, causation, and time.