DSA Fundamentals: Arrays & Strings - From Theory to LeetCode Practice
Master arrays and strings data structures through theory and practical LeetCode problem solving. Covering static vs dynamic arrays, string immutability, time complexity analysis, and essential algorithms.
Data Structures and Algorithms form the foundation of efficient programming. Today, we'll dive deep into Arrays and Strings - two fundamental data structures that appear in virtually every coding interview and real-world application.
This comprehensive guide combines theoretical understanding with practical problem-solving, featuring solutions to essential LeetCode problems that demonstrate core concepts.
Understanding Arrays: Static vs Dynamic
Array Fundamentals
Arrays are mutable (changeable) collections that store elements in contiguous memory locations, providing efficient random access.
# Array mutability example
A = [1, 2, 3, 4, 5]
A[3] = 1 # Modify element at index 3
print(A) # Output: [1, 2, 3, 1, 5]
Static Arrays vs Dynamic Arrays
Static Arrays:
- Fixed size at creation time
- To change size: copy all elements → create new array → paste elements
- Memory allocated at compile time
Dynamic Arrays:
- Size can grow/shrink during runtime
- Programming language automatically handles expansion
- Memory allocated at runtime (Python lists, JavaScript arrays)
Array Operations Time Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| Random Access | O(1) | Direct index access |
Search (in operator) | O(n) | Linear scan required |
| Append to end | O(1)* | *Amortized average |
| Pop from end | O(1) | Direct removal |
| Insert (not at end) | O(n) | Shift elements required |
| Delete (not at end) | O(n) | Shift elements required |
| Modify element | O(1) | Direct access and change |
Understanding Strings: Immutability Matters
Strings are immutable - once created, they cannot be changed. Any "modification" creates a new string object.
String vs Array Operations Comparison
| Operation | Array/List | String (Immutable) |
|---|---|---|
| Appending to end | O(1)* Amortized | O(n) |
| Popping from end | O(1) | O(n) |
| Insertion (not from end) | O(n) | O(n) |
| Deletion (not from end) | O(n) | O(n) |
| Modifying an element | O(1) | O(n) |
| Random Access | O(1) | O(1) |
| Searching | O(n) | O(n) |
Key Insight: String immutability means every modification operation creates a new string, leading to O(n) complexity even for simple operations.
Essential LeetCode Problems & Solutions
Let's explore fundamental array and string problems that demonstrate core algorithmic patterns.
1. Valid Sudoku (LeetCode 36)
Problem: Determine if a 9x9 Sudoku board is valid.
Approach: Use hash sets to track seen numbers in rows, columns, and 3x3 boxes.
def isValidSudoku(board):
# Create sets for rows, columns, and 3x3 boxes
rows = [set() for _ in range(9)]
cols = [set() for _ in range(9)]
boxes = [set() for _ in range(9)]
for row in range(9):
for col in range(9):
current_val = board[row][col]
# Skip empty cells
if current_val == '.':
continue
# Calculate box index: (row//3) * 3 + (col//3)
box_index = (row // 3) * 3 + (col // 3)
# Check if number already exists
if (current_val in rows[row] or
current_val in cols[col] or
current_val in boxes[box_index]):
return False
# Add to respective sets
rows[row].add(current_val)
cols[col].add(current_val)
boxes[box_index].add(current_val)
return True
Time Complexity: O(1) - Fixed 9x9 grid = 81 operations Space Complexity: O(1) - Fixed number of sets
Key Concepts:
- Hash Set Usage: Efficient O(1) lookup for duplicates
- Box Index Calculation:
(row//3) * 3 + (col//3)maps 2D coordinates to box number - Early Termination: Return False immediately when duplicate found
2. Longest Consecutive Sequence (LeetCode 128)
Problem: Find the length of the longest consecutive elements sequence in O(n) time.
Approach: Use hash set to identify sequence starts and extend sequences.
def longestConsecutive(nums):
if not nums:
return 0
# Convert to set for O(1) lookup and remove duplicates
num_set = set(nums)
longest = 0
for num in num_set:
# Check if this can be a sequence start
if num - 1 not in num_set:
current_num = num
current_length = 1
# Extend the sequence
while current_num + 1 in num_set:
current_num += 1
current_length += 1
# Update longest sequence found
longest = max(longest, current_length)
return longest
Time Complexity: O(n) - Each number visited at most twice Space Complexity: O(n) - Hash set storage
Key Concepts:
- Sequence Start Identification: Only start counting when
num - 1not in set - Hash Set Optimization: O(1) lookup prevents nested loops
- Duplicate Handling: Set automatically removes duplicates
3. Product of Array Except Self (LeetCode 238)
Problem: Return array where each element is the product of all elements except itself (no division allowed).
Approach: Two-pass algorithm - left products, then right products.
def productExceptSelf(nums):
n = len(nums)
result = [1] * n
# First pass: accumulate products to the left
for i in range(1, n):
result[i] = result[i-1] * nums[i-1]
# Second pass: multiply by products to the right
right_product = 1
for i in range(n-1, -1, -1):
result[i] *= right_product
right_product *= nums[i]
return result
# Example walkthrough:
# nums = [1, 2, 3, 4]
# After first pass: [1, 1, 2, 6] (left products)
# After second pass: [24, 12, 8, 6] (left * right products)
Time Complexity: O(n) - Two passes through array Space Complexity: O(1) - Only using result array (required output)
Key Concepts:
- Two-Pass Strategy: Separate left and right product calculations
- Space Optimization: Use result array to store intermediate values
- No Division Constraint: Handles edge cases like zeros naturally
4. Top K Frequent Elements (LeetCode 347)
Problem: Return k most frequent elements from an array.
Approach 1: Counter + Sorting
from collections import Counter
def topKFrequent(nums, k):
# Count frequencies
counter = Counter(nums)
# Sort by frequency (descending)
sorted_items = sorted(counter.items(), key=lambda x: x[1], reverse=True)
# Extract first k elements
result = []
for i in range(k):
result.append(sorted_items[i][0])
return result
Time Complexity: O(n log n) - Sorting dominates Space Complexity: O(n) - Counter storage
Approach 2: Counter + Bucket Sort (Optimal)
from collections import Counter
def topKFrequent(nums, k):
counter = Counter(nums)
n = len(nums)
# Create buckets for each possible frequency (0 to n)
buckets = [[] for _ in range(n + 1)]
# Place numbers in buckets based on frequency
for num, freq in counter.items():
buckets[freq].append(num)
# Collect k most frequent elements (iterate backwards)
result = []
for freq in range(n, -1, -1):
for num in buckets[freq]:
result.append(num)
if len(result) == k:
return result
return result
Time Complexity: O(n) - Linear bucket sort Space Complexity: O(n) - Bucket array
Key Concepts:
- Bucket Sort Application: Frequency as bucket index
- Trade-off Analysis: Sorting vs bucket sort based on constraints
- Early Termination: Stop when k elements found
5. Group Anagrams (LeetCode 49)
Problem: Group strings that are anagrams of each other.
Approach 1: Sorting as Key
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for s in strs:
# Use sorted string as key
key = ''.join(sorted(s))
groups[key].append(s)
return list(groups.values())
Time Complexity: O(n * m log m) where n = number of strings, m = average string length Space Complexity: O(n * m) - Storage for groups
Approach 2: Character Count as Key (Optimal)
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for s in strs:
# Create character count array
char_count = [0] * 26
for char in s:
char_count[ord(char) - ord('a')] += 1
# Use tuple of counts as key (arrays aren't hashable)
key = tuple(char_count)
groups[key].append(s)
return list(groups.values())
Time Complexity: O(n * m) - Linear in string length Space Complexity: O(n * m) - Storage for groups
Key Concepts:
- Anagram Property: Same character frequencies
- Key Generation: Sorted string vs character count
- Hashable Keys: Tuple conversion for dictionary keys
6. Two Sum (LeetCode 1)
Problem: Find two numbers that add up to target.
Approach: Hash map for complement lookup.
def twoSum(nums, target):
seen = {} # value -> index mapping
for i, num in enumerate(nums):
complement = target - num
if complement in seen:
return [seen[complement], i]
seen[num] = i
return [] # No solution found
Time Complexity: O(n) - Single pass Space Complexity: O(n) - Hash map storage
Key Concepts:
- Complement Strategy: target - current = needed value
- Hash Map Optimization: O(1) lookup vs O(n) nested loops
- Index Tracking: Store positions for result
7. Valid Anagram (LeetCode 242)
Problem: Determine if two strings are anagrams.
Approach 1: Sorting
def isAnagram(s, t):
return sorted(s) == sorted(t)
Time Complexity: O(n log n) - Sorting Space Complexity: O(1) - Ignoring sort space
Approach 2: Character Frequency (Optimal)
from collections import Counter
def isAnagram(s, t):
return Counter(s) == Counter(t)
# Or manual counting:
def isAnagram(s, t):
if len(s) != len(t):
return False
char_count = {}
# Count characters in s
for char in s:
char_count[char] = char_count.get(char, 0) + 1
# Subtract characters in t
for char in t:
if char not in char_count:
return False
char_count[char] -= 1
if char_count[char] == 0:
del char_count[char]
return len(char_count) == 0
Time Complexity: O(n) - Linear scan Space Complexity: O(k) where k = unique characters
8. Contains Duplicate (LeetCode 217)
Problem: Check if array contains any duplicates.
Approach 1: Hash Set
def containsDuplicate(nums):
seen = set()
for num in nums:
if num in seen:
return True
seen.add(num)
return False
Approach 2: Set Length Comparison
def containsDuplicate(nums):
return len(set(nums)) != len(nums)
Time Complexity: O(n) - Single pass or set creation Space Complexity: O(n) - Hash set storage
Key Concepts:
- Early Termination: Return immediately when duplicate found
- Set Properties: Automatic duplicate removal
- Length Comparison: Elegant one-liner solution
Core Algorithm Patterns from Today's Problems
1. Hash-Based Solutions (Primary Pattern)
- When to use: Fast lookups, duplicate detection, complement finding
- Examples: Two Sum, Valid Sudoku, Contains Duplicate, Longest Consecutive Sequence
- Key insight: Trade O(n) space for O(1) lookup time
- Implementation: Hash Sets for membership, Hash Maps for key-value pairs
2. Frequency Analysis (Primary Pattern)
- When to use: Anagrams, character counting, top-k problems
- Examples: Group Anagrams, Valid Anagram, Top K Frequent Elements
- Tools: Counter, hash maps, bucket sort
- Optimization: Linear time frequency-based solutions instead of sorting
3. Multi-Pass Array Processing
- When to use: Complex calculations requiring multiple scans
- Example: Product of Array Except Self (left pass + right pass)
- Benefit: Maintain O(n) time while avoiding division or complex logic
- Pattern: Accumulate information in one direction, then process in reverse
4. Set Operations for Uniqueness
- When to use: Removing duplicates, finding unique elements
- Examples: Longest Consecutive Sequence (duplicate removal), Contains Duplicate
- Advantage: Automatic duplicate handling with O(1) operations
- Common trick: Compare
len(set(array))vslen(array)for duplicate detection
Performance Optimization Tips
1. Choose Right Data Structure
# Slow: List membership testing
if item in my_list: # O(n)
# Fast: Set membership testing
if item in my_set: # O(1)
2. Minimize String Operations
# Slow: String concatenation in loop
result = ""
for char in chars:
result += char # O(n) each iteration
# Fast: Join operation
result = ''.join(chars) # O(n) total
3. Early Termination
# Stop as soon as condition is met
for item in items:
if condition(item):
return True # Don't continue unnecessary iterations
return False
4. Space-Time Trade-offs
# Time-optimized (more space)
seen = set(nums) # O(n) space
return len(seen) != len(nums) # O(n) time
# Space-optimized (more time)
nums.sort() # O(1) space (in-place)
for i in range(1, len(nums)): # O(n log n) time
if nums[i] == nums[i-1]:
return True
Common Pitfalls and Solutions
1. Index Out of Bounds
# Dangerous
for i in range(len(arr)):
if arr[i+1] > arr[i]: # Error when i = len(arr)-1
# Safe
for i in range(len(arr) - 1):
if arr[i+1] > arr[i]:
2. Modifying List While Iterating
# Dangerous
for item in my_list:
if condition(item):
my_list.remove(item) # Changes list size during iteration
# Safe
my_list = [item for item in my_list if not condition(item)]
3. String Immutability Oversight
# Inefficient for large strings
def reverse_string(s):
result = ""
for char in reversed(s):
result += char # O(n²) due to string immutability
return result
# Efficient
def reverse_string(s):
return s[::-1] # O(n) built-in operation
Conclusion
Arrays and strings form the foundation of algorithmic problem-solving. Key takeaways:
Core Concepts Mastered:
- Array mutability vs string immutability and their performance implications
- Static vs dynamic arrays and when to use each
- Time complexity analysis for common operations
- Hash-based optimizations for O(1) lookups
Essential Patterns Mastered Today:
- Hash-based lookups (Two Sum, Valid Sudoku, Contains Duplicate)
- Frequency analysis (anagrams, Top K problems)
- Multi-pass processing (Product Except Self)
- Set operations (duplicate detection, uniqueness)
Problem-Solving Strategies:
- Choose appropriate data structures based on operation requirements
- Consider space-time trade-offs for optimization
- Use early termination when possible
- Leverage built-in functions for common operations
Next Steps:
Building on this foundation, future topics will cover:
- Linked Lists and Pointers
- Stack and Queue Applications
- Tree and Graph Traversals
- Dynamic Programming Patterns
The problems covered here represent fundamental patterns that appear across technical interviews and real-world applications. Master these concepts, and you'll have a solid foundation for tackling more complex algorithmic challenges.
References:
- Greg Hogg DSA Algorithm YouTube Channel
- Algorithm Design Manual by Steven Skiena
- Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein
- Grokking Algorithms by Aditya Bhargava
This comprehensive guide combines theoretical understanding with practical problem-solving, providing a solid foundation for data structures and algorithms mastery.