Data Science Jobs
June 1, 2025 at 07:43 AM
🧠 *Top Data Science Interview Questions & Answers*

1️⃣ *What is the difference between structured and unstructured data?*
- *Structured data* is organized in a fixed format (tables, rows, columns).
- *Unstructured data* has no predefined format (text, images, videos).

2️⃣ *What is multicollinearity? How do you remove it?*
- Multicollinearity occurs when two or more features are highly correlated with each other. The redundancy makes coefficient estimates unstable and hard to interpret.
- Reduce it by:
  • Dropping one variable from each highly correlated group
  • Using dimensionality reduction (e.g., PCA)
  • Applying regularization (Ridge, Lasso)

3️⃣ *Which techniques do you use to find the most correlated features?*
- Correlation matrix (Pearson, Spearman)
- Feature importance from tree-based models (Random Forest, XGBoost)
- Mutual information scores

4️⃣ *Define entropy.*
- Entropy measures the randomness or uncertainty in data: H = -Σ p_i log2(p_i), where p_i is the proportion of class i.
- In decision trees, it measures the impurity of a node. The best feature to split on is the one that reduces entropy the most (highest information gain).

5️⃣ *What is the workflow of Principal Component Analysis (PCA)?*
- Standardize data → Compute covariance matrix → Calculate eigenvectors & eigenvalues → Select top components (largest eigenvalues) → Transform data to the new feature space.

6️⃣ *What are the applications of PCA beyond dimensionality reduction?*
- Noise reduction
- Visualization of high-dimensional data
- Feature extraction
- Data compression

7️⃣ *What is a Convolutional Neural Network (CNN)? Explain how it works.*
- A CNN is a deep learning model used mainly for image data.
- It uses convolutional layers to extract spatial features, pooling layers to reduce the size of the feature maps, and fully connected layers for classification.

💡 *Double tap ❤️ if you found this helpful!*
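
For question 4️⃣, here is a minimal sketch of how entropy and the resulting information gain of a split could be computed. The function names `entropy` and `information_gain` are made up for illustration and are not from any particular library; it uses base-2 logs, so results are in bits.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # H = -sum(p_i * log2(p_i)) over the classes that actually occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Entropy of `parent` minus the size-weighted entropy of its `children`."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted
```

A 50/50 mix of two classes has the maximum entropy for two classes (1 bit), and a split that separates the classes perfectly recovers all of it as information gain.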
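
For question 5️⃣, the PCA workflow can be sketched step by step with NumPy. This is only an illustration under a few assumptions: the helper name `pca` is hypothetical, the input is a 2-D array with at least two feature columns, and constant columns are left unscaled to avoid division by zero. In practice a library implementation such as scikit-learn's `PCA` would normally be used instead.

```python
import numpy as np


def pca(X, n_components):
    """Return (projected data, component vectors, explained-variance ratios)."""
    X = np.asarray(X, dtype=float)

    # 1. Standardize: zero mean and unit variance per feature
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # assumption: leave constant columns unscaled
    Z = (X - X.mean(axis=0)) / std

    # 2. Covariance matrix of the standardized features
    cov = np.cov(Z, rowvar=False)

    # 3. Eigenvalues and eigenvectors (eigh is meant for symmetric matrices)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # 4. Keep the components with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    explained_ratio = eigvals[order] / eigvals.sum()

    # 5. Project the data onto the new feature space
    return Z @ components, components, explained_ratio
```

The explained-variance ratios are one common way to decide how many components to keep, for example by keeping enough to cover a chosen share of the total variance.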